PREFACE TO THE FOURTH EDITION


Apart from the provision of an index of names, the main changes in 
this edition are in the Notes at the end of each chapter. These have 
been revised to include references to results published since the third 
edition went to press and to correct omissions. There are simpler 
proofs of Theorems 234, 352, and 357 and a new Theorem 272. The 
Postscript to the third edition now takes its proper place as part of 
Chapter XX. I am indebted to several correspondents who suggested 
improvements and corrections. 

I have to thank Dr. Ponting for again reading the proofs and Mrs.
V. N. R. Milne for compiling the index of names.
E. M. W. 


ABERDEEN 
July 1959 


PREFACE TO THE FIRST EDITION 


This book has developed gradually from lectures delivered in a number
of universities during the last ten years, and, like many books which 
have grown out of lectures, it has no very definite plan. 

It is not in any sense (as an expert can see by reading the table of 
contents) a systematic treatise on the theory of numbers. It does not 
even contain a fully reasoned account of any one side of that many- 
sided theory, but is an introduction, or a series of introductions, to 
almost all of these sides in turn. We say something about each of a 
number of subjects which are not usually combined in a single volume, 
and about some which are not always regarded as forming part of the 
theory of numbers at all. Thus Chs. XII-XV belong to the ‘algebraic’ 
theory of numbers, Chs. XIX-XXI to the ‘additive’, and Ch. XXII 
to the ‘analytic’ theories; while Chs. III, XI, XXIII, and XXIV deal 
with matters usually classified under the headings of ‘geometry of 
numbers’ or ‘Diophantine approximation’. There is plenty of variety 
in our programme, but very little depth; it is impossible, in 400 pages, 
to treat any of these many topics at all profoundly. 

There are large gaps in the book which will be noticed at once by any 
expert. The most conspicuous is the omission of any account of the 
theory of quadratic forms. This theory has been developed more 
systematically than any other part of the theory of numbers, and there 
are good discussions of it in easily accessible books. We had to omit 
something, and this seemed to us the part of the theory where we had 
the least to add to existing accounts. 

We have often allowed our personal interests to decide our pro- 
gramme, and have selected subjects less because of their importance 
(though most of them are important enough) than because we found 
them congenial and because other writers have left us something to 
say. Our first aim has been to write an interesting book, and one unlike 
other books. We may have succeeded at the price of too much eccen- 
tricity, or we may have failed; but we can hardly have failed com- 
pletely, the subject-matter being so attractive that only extravagant 
incompetence could make it dull. 

The book is written for mathematicians, but it does not demand any 
great mathematical knowledge or technique. In the first eighteen 
chapters we assume nothing that is not commonly taught in schools, 
and any intelligent university student should find them comparatively 
easy reading. The last six are more difficult, and in them we presuppose 
a little more, but nothing beyond the content of the simpler university
courses. 

The title is the same as that of a very well-known book by Professor 
L. E. Dickson (with which ours has little in common). We proposed 
at one time to change it to An introduction to arithmetic, a more novel 
and in some ways a more appropriate title; but it was pointed out that 
this might lead to misunderstandings about the content of the book. 

A number of friends have helped us in the preparation of the book. 
Dr. H. Heilbronn has read all of it both in manuscript and in print,
and his criticisms and suggestions have led to many very substantial 
improvements, the most important of which are acknowledged in the 
text. Dr. H. S. A. Potter and Dr. S. Wylie have read the proofs and 
helped us to remove many errors and obscurities. They have also 
checked most of the references to the literature in the notes at the ends 
of the chapters. Dr. H. Davenport and Dr. R. Rado have also read 
parts of the book, and in particular the last chapter, which, after their 
suggestions and Dr. Heilbronn’s, bears very little resemblance to the 
original draft. 

We have borrowed freely from the other books which are catalogued 
on pp. 414-15, and especially from those of Landau and Perron. To 
Landau in particular we, in common with all serious students of the 
theory of numbers, owe a debt which we could hardly overstate. 

G. H. H.
E. M. W.
OXFORD
August 1938


REMARKS ON NOTATION 


We borrow four symbols from formal logic, viz.

$\to,\quad \equiv,\quad \exists,\quad \in.$

$\to$ is to be read as ‘implies’. Thus
$l \mid m \to l \mid n$ (p. 2)
means ‘ ‘‘l is a divisor of m’’ implies ‘‘l is a divisor of n’’ ’, or, what is
the same thing, ‘if l divides m then l divides n’; and
$b \mid a \,.\, c \mid b \to c \mid a$ (p. 1)
means ‘if b divides a and c divides b then c divides a’.

$\equiv$ is to be read ‘is equivalent to’. Thus
$m \mid ka-ka' \equiv m_1 \mid a-a'$ (p. 51)
means that the assertions ‘m divides ka−ka′’ and ‘$m_1$ divides a−a′’
are equivalent; either implies the other.

These two symbols must be distinguished carefully from $\to$ (tends to)
and $\equiv$ (is congruent to). There can hardly be any misunderstanding,
since $\to$ and $\equiv$ are always relations between propositions.

$\exists$ is to be read as ‘there is an’. Thus
$\exists\, l \,.\, 1 < l < m \,.\, l \mid m$ (p. 2)
means ‘there is an l such that (i) 1 < l < m and (ii) l divides m’.

$\in$ is the relation of a member of a class to the class. Thus
$m \in S \,.\, n \in S \to (m \pm n) \in S$ (p. 19)
means ‘if m and n are members of S then m+n and m−n are members
of S’.

A star affixed to the number of a theorem (e.g. Theorem 15*) means
that the proof of the theorem is too difficult to be included in the book.
It is not affixed to theorems which are not proved but may be proved
by arguments similar to those used in the text.
I
THE SERIES OF PRIMES (1) 


1.1. Divisibility of integers. The numbers
..., −3, −2, −1, 0, 1, 2, ...
are called the rational integers, or simply the integers; the numbers
0, 1, 2, 3, ...
the non-negative integers; and the numbers
1, 2, 3, ...
the positive integers. The positive integers form the primary subject-
matter of arithmetic, but it is often essential to regard them as a sub-
class of the integers or of some larger class of numbers.

In what follows the letters 

a, b, ..., n, p, ..., x, y, ...

will usually denote integers, which will sometimes, but not always, be 
subject to further restrictions, such as to be positive or non-negative. 
We shall often use the word ‘number’ as meaning ‘integer’ (or ‘positive 
integer’, etc.), when it is clear from the context that we are considering 
only numbers of this particular class. 

An integer a is said to be divisible by another integer b, not 0, if 
there is a third integer c such that 

$a = bc.$

If a and b are positive, c is necessarily positive. We express the fact
that a is divisible by b, or b is a divisor of a, by

$b \mid a.$

Thus $1 \mid a$, $a \mid a$; and $b \mid 0$ for every b but 0. We shall also sometimes use

$b \nmid a$

to express the contrary of $b \mid a$. It is plain that
$b \mid a \,.\, c \mid b \to c \mid a,$
$b \mid a \to bc \mid ac$
if $c \neq 0$, and
$c \mid a \,.\, c \mid b \to c \mid ma+nb$
for all integral m and n.
1.2. Prime numbers. In this section and until § 2.9 the numbers
considered are generally positive integers.† Among the positive integers
there is a sub-class of peculiar importance, the class of primes. A num-
ber p is said to be prime if

(i) $p > 1$,

(ii) p has no positive divisors except 1 and p.

For example, 37 is a prime. It is important to observe that 1 is not
reckoned as a prime. In this and the next chapter we reserve the letter
p for primes.‡

† There are occasional exceptions, as in § 1.7, where $e^x$ is the exponential function of
analysis.

A number greater than 1 and not prime is called composite. 

Our first theorem is 

THEOREM 1. Every positive integer, except 1, is a product of primes.

Either n is prime, when there is nothing to prove, or n has divisors
between 1 and n. If m is the least of these divisors, m is prime; for
otherwise
$\exists\, l \,.\, 1 < l < m \,.\, l \mid m;$
and $l \mid m \to l \mid n,$
which contradicts the definition of m.

Hence n is prime or divisible by a prime less than n, say $p_1$, in which
case
$n = p_1 n_1, \quad 1 < n_1 < n.$

Here either $n_1$ is prime, in which case the proof is completed, or it is
divisible by a prime $p_2$ less than $n_1$, in which case

$n = p_1 n_1 = p_1 p_2 n_2, \quad 1 < n_2 < n_1 < n.$

Repeating the argument, we obtain a sequence of decreasing numbers
$n, n_1, \ldots, n_{k-1}, \ldots$, all greater than 1, for each of which the same
alternative presents itself. Sooner or later we must accept the first alternative,
that $n_{k-1}$ is a prime, say $p_k$, and then

(1.2.1) $n = p_1 p_2 \cdots p_k.$
Thus $666 = 2 \cdot 3 \cdot 3 \cdot 37$.

If $ab = n$, then a and b cannot both exceed $\sqrt{n}$. Hence any composite
n is divisible by a prime p which does not exceed $\sqrt{n}$.

The primes in (1.2.1) are not necessarily distinct, nor arranged in 
any particular order. If we arrange them in increasing order, associate 
sets of equal primes into single factors, and change the notation appro- 
priately, we obtain 
(1.2.2) $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} \quad (a_1 > 0,\ a_2 > 0,\ \ldots,\ p_1 < p_2 < \cdots).$

We then say that n is expressed in standard form. 
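The proof of Theorem 1, together with the remark that a composite n has a prime divisor not exceeding $\sqrt{n}$, amounts to trial division. The following Python fragment is a minimal sketch of that idea, not part of the original text; the name `standard_form` is our own, and the result is the list of pairs $(p_i, a_i)$ of (1.2.2).

```python
def standard_form(n):
    """Return [(p1, a1), (p2, a2), ...] with p1 < p2 < ... and n = p1**a1 * p2**a2 * ..."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    d = 2
    # The least divisor d > 1 of the current n is prime, and only d <= sqrt(n) need be tried.
    while d * d <= n:
        if n % d == 0:
            a = 0
            while n % d == 0:
                n //= d
                a += 1
            factors.append((d, a))
        d += 1
    if n > 1:
        # What remains has no divisor up to its square root, so it is prime.
        factors.append((n, 1))
    return factors

print(standard_form(666))  # [(2, 1), (3, 2), (37, 1)]
```

The printed result corresponds to 666 = 2.3.3.37 with the equal primes collected into a single factor.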


‡ It would be inconvenient to have to observe this convention rigidly throughout
the book, and we often depart from it. In Ch. IX, for example, we use p/q for a typical
rational fraction, and p is not usually prime. But p is the ‘natural’ letter for a prime,
and we give it preference when we can conveniently.


1.3. Statement of the fundamental theorem of arithmetic.
There is nothing in the proof of Theorem 1 to show that (1.2.2) is a
unique expression of n, or, what is the same thing, that (1.2.1) is unique
except for possible rearrangement of the factors; but consideration of
special cases at once suggests that this is true.

THEOREM 2 (THE FUNDAMENTAL THEOREM OF ARITHMETIC). The
standard form of n is unique; apart from rearrangement of factors, n can be
expressed as a product of primes in one way only.

Theorem 2 is the foundation of systematic arithmetic, but we shall 
not use it in this chapter, and defer the proof to § 2.10. It is however 
convenient to prove at once that it is a corollary of the simpler theorem 
which follows. 

THEOREM 3 (EUCLID'S FIRST THEOREM). If p is prime, and $p \mid ab$,
then $p \mid a$ or $p \mid b$.

We take this theorem for granted for the moment and deduce 
Theorem 2. The proof of Theorem 2 is then reduced to that of Theorem 
3, which is given in § 2.10. 

It is an obvious corollary of Theorem 3 that 

$p \mid abc \ldots l \to p \mid a$ or $p \mid b$ or $p \mid c$ ... or $p \mid l$,
and in particular that, if a, b, ..., l are primes, then p is one of a, b, ..., l.
Suppose now that

$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} = q_1^{b_1} q_2^{b_2} \cdots q_j^{b_j},$
each product being a product of primes in standard form. Then
$p_i \mid q_1^{b_1} \cdots q_j^{b_j}$ for every i, so that every p is a q; and similarly every q
is a p. Hence k = j and, since both sets are arranged in increasing
order, $p_i = q_i$ for every i.
If $a_i > b_i$, and we divide by $p_i^{b_i}$, we obtain
$p_1^{a_1} \cdots p_i^{a_i - b_i} \cdots p_k^{a_k} = p_1^{b_1} \cdots p_{i-1}^{b_{i-1}} p_{i+1}^{b_{i+1}} \cdots p_k^{b_k}.$

The left-hand side is divisible by $p_i$, while the right-hand side is not;
a contradiction. Similarly $b_i > a_i$ yields a contradiction. It follows
that $a_i = b_i$, and this completes the proof of Theorem 2.

It will now be obvious why 1 should not be counted as a prime. If 
it were, Theorem 2 would be false, since we could insert any number 
of unit factors. 


1.4. The sequence of primes. The first primes are 
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, ...


It is easy to construct a table of primes, up to a moderate limit N, by a
procedure known as the ‘sieve of Eratosthenes’. We have seen that
if $n \le N$, and n is not prime, then n must be divisible by a prime not
greater than $\sqrt{N}$. We now write down the numbers

2, 3, 4, 5, 6, ..., N

and strike out successively

(i) 4, 6, 8, 10, ..., i.e. $2^2$ and then every even number,

(ii) 9, 15, 21, 27, ..., i.e. $3^2$ and then every multiple of 3 not yet struck
out,

(iii) 25, 35, 55, 65, ..., i.e. $5^2$, the square of the next remaining number
after 3, and then every multiple of 5 not yet struck out, ....

We continue the process until the next remaining number, after that
whose multiples were cancelled last, is greater than $\sqrt{N}$. The numbers
which remain are primes. All the present tables of primes have been
constructed by modifications of this procedure.
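The sieve translates almost word for word into a short program. The following Python sketch is an illustration only (the function name `sieve` is ours); unlike the description above it does not check whether a multiple has already been struck out, which makes no difference to the result.

```python
def sieve(N):
    """Return the list of primes not exceeding N by the sieve of Eratosthenes."""
    if N < 2:
        return []
    struck = bytearray(N + 1)  # struck[k] == 1 means that k has been struck out
    p = 2
    while p * p <= N:          # stop once the next remaining number exceeds sqrt(N)
        if not struck[p]:
            # begin at p*p: smaller multiples of p have a smaller prime factor
            for k in range(p * p, N + 1, p):
                struck[k] = 1
        p += 1
    return [k for k in range(2, N + 1) if not struck[k]]

print(sieve(53))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]
```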

The tables indicate that the series of primes is infinite. They are 
complete up to 11,000,000; the total number of primes below 10 million 
is 664,579; and the number between 9,900,000 and 10,000,000 is 6,134. 
The total number of primes below 1,000,000,000 is 50,847,478; these 
primes are not known individually. A number of very large primes, 
mostly of the form $2^p - 1$ (see the note at the end of the chapter), are
also known; the largest found so far has nearly 700 digits.

These data suggest the theorem 


THEOREM 4 (EUCLID’S SECOND THEOREM). The number of primes is
infinite.


We shall prove this in § 2.1. 

The ‘average’ distribution of the primes is very regular; its density 
shows a steady but slow decrease. The numbers of primes in the first 
five blocks of 1,000 numbers are 


168, 135, 127, 120, 119, 
and those in the last five blocks of 1,000 below 10,000,000 are 
62, 58, 67, 64, 53. 
The last 53 primes are divided into sets of 
5, 4, 7, 4, 6, 3, 6, 4, 5, 9 


in the ten hundreds of the thousand. 
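The counts for the first five blocks of 1,000 numbers are easily recomputed. The sketch below is our own illustration in Python (the helper `prime_flags` is a hypothetical name); it marks the primes up to 5,000 with a sieve and counts them block by block, the blocks being taken as 1 to 1,000, 1,001 to 2,000, and so on.

```python
from math import isqrt

def prime_flags(N):
    """Return a bytearray f of length N + 1 with f[k] == 1 exactly when k is prime."""
    flags = bytearray([1]) * (N + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, isqrt(N) + 1):
        if flags[p]:
            # strike out p*p, p*p + p, ... in one slice assignment
            flags[p * p :: p] = bytes(len(range(p * p, N + 1, p)))
    return flags

flags = prime_flags(5000)
blocks = [sum(flags[1000 * i + 1 : 1000 * (i + 1) + 1]) for i in range(5)]
print(blocks)  # [168, 135, 127, 120, 119]
```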
On the other hand the distribution of the primes in detail is extremely 
irregular. 
In the first place, the tables show at intervals long blocks of com-
posite numbers. Thus the prime 370,261 is followed by 111 composite
numbers. It is easy to see that these long blocks must occur. Suppose
that
2, 3, 5, ..., p
are the primes up to p. Then all numbers up to p are divisible by one
of these primes, and therefore, if

$2 \cdot 3 \cdot 5 \cdots p = q,$
all of the p−1 numbers

q+2, q+3, q+4, ..., q+p

are composite. If Theorem 4 is true, then p can be as large as we please;
and otherwise all numbers from some point on are composite.


THEOREM 5. There are blocks of consecutive composite numbers whose 
length exceeds any given number N.
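The construction behind Theorem 5 can be tried on a small case. The Python fragment below is a sketch for illustration (the names `is_prime` and `composite_block` are ours, and primality is tested by plain trial division); for p = 13 it forms q = 2.3.5.7.11.13 and confirms that none of q+2, ..., q+p is prime.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def composite_block(p):
    """Return q = 2.3.5...p and the p - 1 numbers q+2, ..., q+p, for a prime p."""
    q = 1
    for r in range(2, p + 1):
        if is_prime(r):
            q *= r
    return q, list(range(q + 2, q + p + 1))

q, block = composite_block(13)
print(q, len(block), any(is_prime(m) for m in block))  # 30030 12 False
```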


On the other hand, the tables indicate the indefinite persistence of 
prime-pairs, such as 3, 5 or 101, 103, differing by 2. There are 1,224 
such pairs (p, p+2) below 100,000, and 8,169 below 1,000,000. The
evidence, when examined in detail, appears to justify the conjecture 


There are infinitely many prime-pairs (p, p+2). 
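The two counts of prime-pairs quoted above can be reproduced with a sieve. The following Python sketch is ours, not the text's (the name `twin_pairs_below` is hypothetical); it counts the pairs (p, p+2) of primes with p+2 < N.

```python
from math import isqrt

def twin_pairs_below(N):
    """Count pairs (p, p+2) of primes with p + 2 < N."""
    flags = bytearray([1]) * N          # flags[k] == 1 exactly when k is prime
    flags[0:2] = b"\x00\x00"
    for p in range(2, isqrt(N) + 1):
        if flags[p]:
            flags[p * p :: p] = bytes(len(range(p * p, N, p)))
    return sum(1 for p in range(2, N - 2) if flags[p] and flags[p + 2])

print(twin_pairs_below(100_000))    # 1224
print(twin_pairs_below(1_000_000))  # 8169
```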


It is indeed reasonable to conjecture more. The numbers p, p+2,
p+4 cannot all be prime, since one of them must be divisible by 3;
but there is no obvious reason why p, p+2, p+6 should not all be
prime, and the evidence indicates that such prime-triplets also persist
indefinitely. Similarly, it appears that triplets (p, p+4, p+6) persist in-
definitely. We are therefore led to the conjecture


There are infinitely many prime-triplets of the types (p, p+2, p+6) and
(p, p+4, p+6).


Such conjectures, with larger sets of primes, may be multiplied, but 
their proof or disproof is at present beyond the resources of mathematics.


1.5. Some questions concerning primes. What are the natural 
questions to ask about a sequence of numbers such as the primes ? We 
have suggested some already, and we now ask some more. 


(1) Is there a simple general formula for the n-th prime $p_n$ (a formula,
that is to say, by which we can calculate the value of $p_n$ for any given
n without previous knowledge of its value)? No such formula is known.
Indeed it is unlikely that such a formula is possible, for the distribution
of the primes is quite unlike what we should expect on any such
hypothesis.

On the other hand, it is possible to devise a number of ‘formulae’
for $p_n$ which are, from our point of view, no more than curiosities.
Such a formula essentially defines $p_n$ in terms of itself, and no previously
unknown $p_n$ can be calculated from it. We give an example in Theorem
419 of Ch. XXII.

Similar remarks apply to another question of the same kind, viz. 


(2) is there a general formula for the prime which follows a given prime
(i.e. a recurrence formula such as $p_{n+1} = p_n^2 + 2$)?


Another natural question is 


(3) is there a rule by which, given any prime p, we can find a larger
prime q?

This question of course presupposes that, as stated in Theorem 4, the
number of primes is infinite. It would be answered in the affirmative if
any simple function f(n) were known which assumed prime values for all
integral values of n. Apart from trivial curiosities of the kind already
mentioned, no such function is known. The only plausible conjecture
concerning the form of such a function was made by Fermat,† and
Fermat’s conjecture was false.

Our next question is 


(4) how many primes are there less than a given number x ? 


This question is a much more profitable one, but it requires careful 
interpretation. Suppose that, as is usual, we define 


$\pi(x)$

to be the number of primes which do not exceed x, so that $\pi(1) = 0$,
$\pi(2) = 1$, $\pi(20) = 8$. If $p_n$ is the nth prime then

$\pi(p_n) = n,$
so that $\pi(x)$, as function of x, and $p_n$, as function of n, are inverse
functions. To ask for an exact formula for $\pi(x)$, of any simple type, is
therefore practically to repeat question (1).
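The values just quoted, and the inverse relation between $\pi(x)$ and $p_n$, can be checked directly for small arguments. The Python fragment below is a naive sketch for illustration; the names `is_prime`, `prime_pi` and `nth_prime` are ours, and the method (trial division) is chosen for brevity, not efficiency.

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def prime_pi(x):
    """Number of primes not exceeding x."""
    return sum(1 for k in range(2, int(x) + 1) if is_prime(k))

def nth_prime(n):
    """The n-th prime, counting from p_1 = 2."""
    count, k = 0, 1
    while count < n:
        k += 1
        if is_prime(k):
            count += 1
    return k

print(prime_pi(1), prime_pi(2), prime_pi(20))                   # 0 1 8
print(all(prime_pi(nth_prime(n)) == n for n in range(1, 50)))   # True
```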

We must therefore interpret the question differently, and ask ‘about 
how many primes ...?’ Are most numbers primes, or only a small 
proportion? Is there any simple function f(x) which is ‘a good measure’
of $\pi(x)$?

We answer these questions in § 1.8 and Ch. XXII.

† See § 2.5.


1.6. Some notations. We shall often use the symbols
(1.6.1) $O,\quad o,\quad \sim,$
and occasionally
(1.6.2) $\prec,\quad \succ,\quad \asymp.$


These symbols are defined as follows. 

Suppose that n is an integral variable which tends to infinity, and x a 
continuous variable which tends to infinity or to zero or to some other 
limiting value; that $\phi(n)$ or $\phi(x)$ is a positive function of n or x; and
that $f(n)$ or $f(x)$ is any other function of n or x. Then

(i) $f = O(\phi)$ means that† $|f| < A\phi$,
where A is independent of n or x, for all values of n or x in question;

(ii) $f = o(\phi)$ means that $f/\phi \to 0$;

and
(iii) $f \sim \phi$ means that $f/\phi \to 1$.

Thus
$10x = O(x), \quad \sin x = O(1), \quad x = O(x^2),$
$x = o(x^2), \quad \sin x = o(x), \quad x+1 \sim x,$
where $x \to \infty$, and
$x^2 = O(x), \quad x^2 = o(x), \quad \sin x \sim x, \quad 1+x \sim 1,$
when $x \to 0$. It is to be observed that $f = o(\phi)$ implies, and is stronger
than, $f = O(\phi)$.
As regards the symbols (1.6.2),

(iv) $f \prec \phi$ means $f/\phi \to 0$, and is equivalent to $f = o(\phi)$;

(v) $f \succ \phi$ means $f/\phi \to \infty$;

(vi) $f \asymp \phi$ means $A\phi < f < A\phi$,
where the two A’s (which are naturally not the same) are both positive
and independent of n or x. Thus $f \asymp \phi$ asserts that ‘f is of the same
order of magnitude as $\phi$’.

We shall very often use A as in (vi), viz. as an unspecified positive 
constant. Different A’s have usually different values, even when they 
occur in the same formula; and, even when definite values can be 
assigned to them, these values are irrelevant to the argument. 

So far we have defined (for example) ‘$f = O(1)$’, but not ‘$O(1)$’ in
isolation; and it is convenient to make our notations more elastic. We
agree that ‘$O(\phi)$’ denotes an unspecified f such that $f = O(\phi)$. We can
then write, for example,

$O(1) + O(1) = O(1) = o(x)$
when $x \to \infty$, meaning by this ‘if $f = O(1)$ and $g = O(1)$ then $f+g = O(1)$
and a fortiori $f+g = o(x)$’. Or again we may write

$\sum_{\nu=1}^{n} O(1) = O(n),$

meaning by this that the sum of n terms, each numerically less than a
constant, is numerically less than a constant multiple of n.

It is to be observed that the relation ‘=’, asserted between O or o
symbols, is not usually symmetrical. Thus $o(1) = O(1)$ is always true;
but $O(1) = o(1)$ is usually false. We may also observe that $f \sim \phi$ is
equivalent to $f = \phi + o(\phi)$ or to

$f = \phi\{1 + o(1)\}.$
In these circumstances we say that f and $\phi$ are asymptotically equivalent,
or that f is asymptotic to $\phi$.

† |f| denotes, as usually in analysis, the modulus or absolute value of f.

There is another phrase which it is convenient to define here. Suppose 
that P is a possible property of a positive integer, and P(x) the number 
of numbers less than x which possess the property P. If 

$P(x) \sim x,$
when $x \to \infty$, i.e. if the number of numbers less than x which do not
possess the property is $o(x)$, then we say that almost all numbers possess
the property. Thus we shall see† that $\pi(x) = o(x)$, so that almost all
numbers are composite.


1.7. The logarithmic function. The theory of the distribution
of primes demands a knowledge of the properties of the logarithmic
function log x. We take the ordinary analytic theory of logarithms and
exponentials for granted, but it is important to lay stress on one
property of log x.‡

Since
$e^x = 1 + x + \cdots + \frac{x^n}{n!} + \frac{x^{n+1}}{(n+1)!} + \cdots,$
$x^{-n} e^x > \frac{x}{(n+1)!} \to \infty$
when $x \to \infty$. Hence $e^x$ tends to infinity more rapidly than any power
of x. It follows that log x, the inverse function, tends to infinity more
slowly than any positive power of x; $\log x \to \infty$, but

(1.7.1) $\frac{\log x}{x^{\delta}} \to 0,$

or $\log x = o(x^{\delta})$, for every positive $\delta$. Similarly, loglog x tends to infinity
more slowly than any power of log x.

† This follows at once from Theorem 7.

‡ log x is, of course, the ‘Napierian’ logarithm of x, to base e. ‘Common’ logarithms
have no mathematical interest.

We may give a numerical illustration of the slowness of the growth
of log x. If $x = 10^9 = 1,000,000,000$ then

$\log x = 20.72\ldots.$

Since $e^3 = 20.08\ldots$, loglog x is a little greater than 3, and logloglog x a
little greater than 1. If $x = 10^{1,000}$, logloglog x is a little greater than 2.
In spite of this, the ‘order of infinity’ of logloglog x has been made to
play a part in the theory of primes.
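These figures are quickly confirmed by machine. The following Python lines are a small illustration of our own; since $10^{1,000}$ is too large for floating-point arithmetic, its logarithm is formed as 1000 log 10.

```python
import math

log_x = 9 * math.log(10)                      # log x for x = 10**9
print(round(log_x, 2))                        # 20.72
print(round(math.exp(3), 2))                  # 20.09
print(round(math.log(log_x), 3))              # 3.031
print(round(math.log(math.log(log_x)), 3))    # 1.109

log_big = 1000 * math.log(10)                 # log x for x = 10**1000
print(round(math.log(math.log(log_big)), 3))  # 2.047
```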

The function

$\frac{x}{\log x}$

is particularly important in the theory of primes. It tends to infinity
more slowly than x but, in virtue of (1.7.1), more rapidly than $x^{1-\delta}$,
i.e. than any power of x lower than the first; and it is the simplest
function which has this property.


1.8. Statement of the prime number theorem. After this preface 
we can state the theorem which answers question (4) of § 1.5.


THEOREM 6 (THE PRIME NUMBER THEOREM). The number of primes
not exceeding x is asymptotic to x/log x:

$\pi(x) \sim \frac{x}{\log x}.$

This theorem is the central theorem in the theory of the distribution 
of primes. We shall give a proof in Ch. XXII. This proof is not easy 
but, in the same chapter, we shall give a much simpler proof of the 
weaker 


THEOREM 7 (TCHEBYCHEF’S THEOREM). The order of magnitude of
$\pi(x)$ is x/log x:
$\pi(x) \asymp \frac{x}{\log x}.$

It is interesting to compare Theorem 6 with the evidence of the tables.
The values of $\pi(x)$ for $x = 10^3$, $x = 10^6$, and $x = 10^9$ are
168, 78,498, 50,847,478;
and the values of x/log x, to the nearest integer, are
145, 72,382, 48,254,942.
The ratios are
1.159..., 1.084..., 1.053...;
and show an approximation, though not a very rapid one, to 1. The
excess of the actual over the approximate values can be accounted for
by the general theory.
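The comparison is easily repeated. The Python sketch below is our own illustration; it takes the three values of $\pi(x)$ from the text as given (it does not recompute them) and prints x/log x to the nearest integer together with the ratio. The last digits of the printed ratios need not agree exactly with the figures quoted above.

```python
import math

# (x, pi(x)) as quoted in the text; the values of pi(x) are assumed, not computed here
data = [(10**3, 168), (10**6, 78498), (10**9, 50847478)]

approximations = []
for x, pi_x in data:
    approx = x / math.log(x)
    approximations.append(round(approx))
    print(x, round(approx), round(pi_x / approx, 3))
```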


If
$y = \frac{x}{\log x}$
then $\log y = \log x - \log\log x,$
and $\log\log x = o(\log x),$
so that $\log y \sim \log x, \quad x = y \log x \sim y \log y.$


The function inverse to x/log x is therefore asymptotic to x log x.
From this remark we infer that Theorem 6 is equivalent to


THEOREM 8: $p_n \sim n \log n.$
Similarly, Theorem 7 is equivalent to
THEOREM 9: $p_n \asymp n \log n.$


The 664,999th prime is 10,006,721; the reader should compare these
figures with Theorem 8.
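The suggested comparison takes two lines of arithmetic. The fragment below is a Python illustration of our own, using only the two figures quoted above; it shows that $p_n$ exceeds n log n by roughly 12 per cent at this point, so that the approach of the ratio to 1 is again slow.

```python
import math

n, p_n = 664999, 10006721        # the figures quoted in the text
estimate = n * math.log(n)       # n log n, as in Theorem 8
ratio = p_n / estimate
print(round(estimate), round(ratio, 3))
```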

We arrange what we have to say about primes and their distribution 
in three chapters. This introductory chapter contains little but defini-
tions and preliminary explanations; we have proved nothing except the
easy, though important, Theorem 1. In Ch. II we prove rather more:
in particular, Euclid’s theorems 3 and 4. The first of these carries
with it (as we saw in § 1.3) the ‘fundamental theorem’ Theorem 2, on
which almost all our later work depends; and we give two proofs in
§§ 2.10-2.11. We prove Theorem 4 in §§ 2.1, 2.4, and 2.6, using several
methods, some of which enable us to develop the theorem a little further.
Later, in Ch. XXII, we return to the theory of the distribution of primes,
and develop it as far as is possible by elementary methods, proving,
amongst other results, Theorem 7 and finally Theorem 6.


NOTES ON CHAPTER I


§ 1.3. Theorem 3 is Euclid vii. 30. Theorem 2 does not seem to have been 
stated explicitly before Gauss (D.A., § 16). It was, of course, familiar to earlier 
mathematicians; but Gauss was the first to develop arithmetic as a systematic 
science. See also § 12.5. 

§ 1.4. The best table of primes is D. N. Lehmer’s List of prime numbers from 1
to 10,006,721 [Carnegie Institution, Washington, 165 (1914)]. The same author’s
Factor table for the first ten millions [Carnegie Institution, Washington, 105 (1909)]
gives the smallest factor of all numbers up to 10,017,000 not divisible by 2, 3, 5,
or 7. See also Liste des nombres premiers du onzième million (ed. Beeger, Amster-
dam, 1951). Information about earlier tables will be found in the introductions
to Lehmer’s two volumes, and in Dickson’s History, i, ch. xiii. There are
manuscript tables by Kulik in the possession of the Academy of Sciences of Vienna
which extend up to 100,000,000, but which are, according to Lehmer, not accurate
enough for publication. Our numbers of primes are less by 1 than Lehmer’s
because he counts 1 as a prime. Mapes [Math. Computation 17 (1963), 184-5] gives
a table of $\pi(x)$ for x any multiple of 10 million up to 1,000 million.

A list of tables of primes with descriptive notes is given in D. H. Lehmer’s 
Guide to tables in the theory of numbers (Washington, 1941). 

Theorem 4 is Euclid ix. 20. 

For Theorem 5 see Lucas, Théorie des nombres, i (1891), 359-61. 

Kraitchik [Sphinx, 6 (1936), 166 and 8 (1938), 86] lists all primes between
$10^{12} - 10^4$ and $10^{12} + 10^4$. These lists contain 36 prime pairs (p, p+2), of which the
last is

1,000,000,009,649, 1,000,000,009,651.

This seems to be the largest pair known.

In § 22.20 we give a simple argument leading to a conjectural formula for the
number of pairs (p, p+2) below x. This agrees well with the known facts. The
method can be used to find many other conjectural theorems concerning pairs, 
triplets, and larger blocks of primes. 

§ 1.5. Our list of questions is modified from that given by Carmichael, Theory 
of numbers, 29.

§ 1.7. Littlewood’s proof that $\pi(x)$ is sometimes greater than the ‘logarithm
integral’ li x depends upon the largeness of logloglog x for large x. See Ingham,
ch. v, or Landau, Vorlesungen, ii. 123-56.

§ 1.8. Theorem 7 was proved by Tchebychef about 1850, and Theorem 6 by 
Hadamard and de la Vallée Poussin in 1896. See Ingham, 4-5; Landau, Hand- 
buch, 3-55; and Ch. XXII, especially the note to §§ 22.14-16. 


II 
THE SERIES OF PRIMES (2) 


2.1. First proof of Euclid’s second theorem. Euclid’s own proof 
of Theorem 4 was as follows. 

Let 2, 3, 5, ..., p be the aggregate of primes up to p, and let
(2.1.1) $q = 2 \cdot 3 \cdot 5 \cdots p + 1.$


Then q is not divisible by any of the numbers 2, 3, 5, ..., p. It is there-
fore either prime, or divisible by a prime between p and q. In either
case there is a prime greater than p, which proves the theorem.
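Euclid's argument is constructive, and both alternatives of the proof occur in practice. The Python sketch below is an illustration only (the names `least_prime_factor` and `euclid_step` are ours); given the primes up to p it forms q as in (2.1.1) and returns q with its least prime factor, which is necessarily greater than p.

```python
def least_prime_factor(n):
    """Smallest divisor of n greater than 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def euclid_step(primes):
    """Given the primes 2, 3, 5, ..., p, return q = 2.3.5...p + 1 and its least prime factor."""
    q = 1
    for r in primes:
        q *= r
    q += 1
    return q, least_prime_factor(q)

print(euclid_step([2, 3, 5, 7, 11]))      # (2311, 2311): q is itself prime
print(euclid_step([2, 3, 5, 7, 11, 13]))  # (30031, 59): q = 59 . 509 is composite
```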

The theorem is equivalent to 


(2.1.2) $\pi(x) \to \infty.$


2.2. Further deductions from Euclid’s argument. If p is the 
nth prime $p_n$, and q is defined as in (2.1.1), it is plain that
$q < p_n^n + 1$
for n > 1,† and so that $p_{n+1} < p_n^n + 1.$
This inequality enables us to assign an upper limit to the rate of in-
crease of $p_n$, and a lower limit to that of $\pi(x)$.
We can, however, obtain better limits as follows. Suppose that

(2.2.1) $p_n < 2^{2^n}$
for n = 1, 2, ..., N. Then Euclid’s argument shows that
(2.2.2) $p_{N+1} \le p_1 p_2 \cdots p_N + 1 < 2^{2+4+\cdots+2^N} + 1 < 2^{2^{N+1}}.$
Since (2.2.1) is true for n = 1, it is true for all n.
Suppose now that $n \ge 4$ and
$e^{e^{n-1}} < x \le e^{e^n}.$
Then‡ $e^{n-1} > 2^n, \quad e^{e^{n-1}} > 2^{2^n};$
and so $\pi(x) \ge \pi(e^{e^{n-1}}) \ge \pi(2^{2^n}) \ge n,$
by (2.2.1). Since $\log\log x \le n$, we deduce that
$\pi(x) \ge \log\log x$
for $x > e^{e^3}$; and it is plain that the inequality holds also for $2 \le x \le e^{e^3}$.
We have therefore proved
THEOREM 10: $\pi(x) \ge \log\log x \quad (x \ge 2).$
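Both the inequality (2.2.1) and Theorem 10 can be tested numerically in a small range. The following Python fragment is a sketch of our own (the names `first_primes` and `prime_pi` are hypothetical); a finite check of this kind is of course no substitute for the proof.

```python
import math

def first_primes(count):
    """Return the first `count` primes, testing each candidate against the primes already found."""
    primes = []
    k = 2
    while len(primes) < count:
        if all(k % p for p in primes if p * p <= k):
            primes.append(k)
        k += 1
    return primes

def prime_pi(x, primes):
    """Number of primes not exceeding x, valid while x does not exceed the last listed prime."""
    return sum(1 for p in primes if p <= x)

# (2.2.1): p_n < 2**(2**n), with n counted from 1
print(all(p < 2 ** (2 ** n) for n, p in enumerate(first_primes(12), start=1)))  # True

# Theorem 10 for 2 <= x < 1000; the 168th prime is 997
small = first_primes(168)
print(all(prime_pi(x, small) >= math.log(math.log(x)) for x in range(2, 1000)))  # True
```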
We have thus gone beyond Theorem 4 and found a lower limit for
the order of magnitude of $\pi(x)$. The limit is of course an absurdly weak
one, since for $x = 10^9$ it gives $\pi(x) > 3$, and the actual value of $\pi(x)$
is over 50 million.

† There is equality when n = 1, p = 2, q = 3.

‡ This is not true for n = 3.


2.3. Primes in certain arithmetical progressions. Euclid’s
argument may be developed in other directions.

THEOREM 11. There are infinitely many primes of the form 4n+3. 

Define q by
$q = 2^2 \cdot 3 \cdot 5 \cdots p - 1,$
instead of by (2.1.1). Then q is of the form 4n+3, and is not divisible
by any of the primes up to p. It cannot be a product of primes 4n+1
only, since the product of two numbers of this form is of the same form;
and therefore it is divisible by a prime 4n+3, greater than p.

THEOREM 12. There are infinitely many primes of the form 6n+5.

The proof is similar. We define q by

$q = 2 \cdot 3 \cdot 5 \cdots p - 1,$

and observe that any prime number, except 2 or 3, is 6n+1 or 6n+5,
and that the product of two numbers 6n+1 is of the same form.

The progression 4n+1 is more difficult. We must assume the truth
of a theorem which we shall prove later (§ 20.3).

THEOREM 13. If a and b have no common factor, then any odd prime
divisor of $a^2+b^2$ is of the form 4n+1.

If we take this for granted, we can prove that there are infinitely
many primes 4n+1. In fact we can prove


THEOREM 14. There are infinitely many primes of the form 8n+5.


We take
$q = 3^2 \cdot 5^2 \cdot 7^2 \cdots p^2 + 2^2,$
a sum of two squares which have no common factor. The square of an
odd number 2m+1 is
$4m(m+1)+1$
and is 8n+1, so that q is 8n+5. Observing that, by Theorem 13, any
prime factor of q is 4n+1, and so 8n+1 or 8n+5, and that the product
of two numbers 8n+1 is of the same form, we can complete the proof
as before.

All these theorems are particular cases of a famous theorem of 
Dirichlet. 

THEOREM 15* (DIRICHLET’S THEOREM).† If a is positive and a and b
have no common divisor except 1, then there are infinitely many primes of
the form an+b.

The proof of this theorem is too difficult for insertion in this book.
There are simpler proofs when b is 1 or −1.

† An asterisk attached to the number of a theorem indicates that it is not proved
anywhere in the book.


2.4. Second proof of Euclid’s theorem. Our second proof of
Theorem 4, which is due to Pólya, depends upon a property of what
are called ‘Fermat’s numbers’.
Fermat’s numbers are defined by

$F_n = 2^{2^n} + 1,$
so that $F_1 = 5, \quad F_2 = 17, \quad F_3 = 257, \quad F_4 = 65537.$
They are of great interest in many ways: for example, it was proved by
Gauss† that, if $F_n$ is a prime p, then a regular polygon of p sides can
be inscribed in a circle by Euclidean methods.
The property of the Fermat numbers which is relevant here is 


THEOREM 16. No two Fermat numbers have a common divisor greater 
than 1. 


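For suppose that $F_n$ and $F_{n+k}$, where k > 0, are two Fermat numbers,
and that
    $m \mid F_n$,  $m \mid F_{n+k}$.
If $x = 2^{2^n}$, we have
    $\dfrac{F_{n+k}-2}{F_n} = \dfrac{2^{2^{n+k}}-1}{2^{2^n}+1} = \dfrac{x^{2^k}-1}{x+1} = x^{2^k-1} - x^{2^k-2} + \cdots - 1$,
and so $F_n \mid F_{n+k}-2$. Hence
    $m \mid F_{n+k}$,  $m \mid F_{n+k}-2$;
and therefore $m \mid 2$. Since $F_n$ is odd, m = 1, which proves the theorem.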
It follows that each of the numbers $F_1$, $F_2$, ..., $F_n$ is divisible by an odd
prime which does not divide any of the others; and therefore that there
are at least n odd primes not exceeding $F_n$. This proves Euclid’s
theorem. Also
    $p_{n+1} \le F_n = 2^{2^n}+1$,
and it is plain that this inequality, which is a little stronger than (2.2.1),
leads to a proof of Theorem 10.
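The two facts used in the proof of Theorem 16 can be checked directly for the first few Fermat numbers. This is only a finite check, not a proof, and the variable names are ours.

```python
from math import gcd

def fermat(n):
    """F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

F = [fermat(n) for n in range(1, 9)]          # F_1, ..., F_8
index_pairs = [(i, j) for i in range(len(F)) for j in range(i + 1, len(F))]

# F_n divides F_{n+k} - 2 for every k > 0 ...
divides = all((F[j] - 2) % F[i] == 0 for i, j in index_pairs)
# ... and so no two Fermat numbers have a common divisor greater than 1.
coprime = all(gcd(F[i], F[j]) == 1 for i, j in index_pairs)
```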


2.5. Fermat’s and Mersenne’s numbers. The first four Fermat
numbers are prime, and Fermat conjectured that all were prime. Euler,
however, found in 1732 that
    $F_5 = 2^{2^5}+1 = 641 \cdot 6700417$
is composite. For
    $641 = 2^4+5^4 = 5 \cdot 2^7+1$,
and so
    $2^{32} = 16 \cdot 2^{28} = (641-5^4)\,2^{28} = 641m - (5 \cdot 2^7)^4 = 641m - (641-1)^4 = 641n - 1$,
where m and n are integers.

† See § 5.8.
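The arithmetic in this argument is easily confirmed by machine; the short check below simply evaluates each step and makes no assumptions beyond the formulae displayed above.

```python
# The two representations of 641 used in the argument.
representations_agree = (641 == 2 ** 4 + 5 ** 4 == 5 * 2 ** 7 + 1)

# 2^32 is of the form 641n - 1, i.e. 2^32 leaves remainder 640 on division by 641.
residue = pow(2, 32, 641)

# Hence 641 divides F_5 = 2^32 + 1; divmod recovers the cofactor.
F5 = 2 ** (2 ** 5) + 1
cofactor, remainder = divmod(F5, 641)
```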
In 1880 Landry proved that
    $F_6 = 2^{2^6}+1 = 274177 \cdot 67280421310721$.

More recent writers have proved that $F_n$ is composite for
    7 ≤ n ≤ 16, n = 18, 19, 23, 36, 38, 39, 55, 63, 73

and many larger values of n. Morehead and Western proved $F_7$ and $F_8$
composite without determining a factor. No factor is known for $F_{13}$ or
for $F_{14}$, but in all the other cases proved to be composite a factor is known.

No prime $F_n$ has been found beyond $F_4$, so that Fermat’s conjecture
has not proved a very happy one. It is perhaps more probable that the
number of primes $F_n$ is finite.† If this is so, then the number of primes
$2^n+1$ is finite, since it is easy to prove

THEOREM 17. If a ≥ 2 and $a^n+1$ is prime, then a is even and $n = 2^m$.

For if a is odd then $a^n+1$ is even; and if n has an odd factor k and
n = kl, then $a^n+1$ is divisible by
    $\dfrac{a^{kl}+1}{a^l+1} = a^{(k-1)l} - a^{(k-2)l} + \cdots + 1$.

It is interesting to compare the fate of Fermat’s conjecture with that
of another famous conjecture, concerning primes of the form $2^n-1$.
We begin with another trivial theorem of much the same type as
Theorem 17.

THEOREM 18. If n > 1 and $a^n-1$ is prime, then a = 2 and n is prime.

For if a > 2, then $a-1 \mid a^n-1$; and if a = 2 and n = kl, then
$2^k-1 \mid 2^n-1$.

† This is what is suggested by considerations of probability. Assuming Theorem 7,
one might argue roughly as follows. The probability that a number n is prime is at
most
    $\dfrac{A}{\log n}$,
and therefore the total expectation of Fermat primes is at most
    $A \sum \dfrac{1}{\log(2^{2^n}+1)} < A \sum 2^{-n} < A$.
This argument (apart from its general lack of precision) assumes that there are no
special reasons why a Fermat number should be likely to be prime, while Theorems 16
and 17 suggest that there are some.

The problem of the primality of $a^n-1$ is thus reduced to that of
the primality of $2^p-1$. It was asserted by Mersenne in 1644 that
$M_p = 2^p-1$ is prime for
p = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257, 

and composite for the other 44 values of p less than 257. The first
mistake in Mersenne’s statement was found about 1886,† when Pervusin
and Seelhoff discovered that $M_{61}$ is prime. Subsequently four further
mistakes were found in Mersenne’s statement and it need no longer be
taken seriously. In 1876 Lucas found a method for testing whether $M_p$
is prime and used it to prove $M_{127}$ prime. This remained the largest
known prime until 1951, when, using different methods, Ferrier found
a larger prime (using only a desk calculating machine) and Miller and
Wheeler (using the EDSAC 1 electronic computer at Cambridge) found
several large primes, of which the largest was
    $180 M_{127}^2 + 1$,
which is larger than Ferrier’s. But Lucas’s test is particularly suitable for
use on a binary digital computer and it has been applied by a succession 
of investigators (Lehmer and Robinson using the SWAC and Hurwitz 
and Selfridge using the IBM 7090, Riesel using the Swedish BESK, 
and Gillies using the ILLIAC II). As a result it is now known that
$M_p$ is prime for
    p = 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107,
        127, 521, 607, 1279, 2203, 2281, 3217,
        4253, 4423, 9689, 9941, 11213
and composite for all other p < 12000. The largest known prime is thus
$M_{11213}$, a number of 3376 digits.

We describe Lucas’s test in § 15.5 and give the test used by Miller
and Wheeler in Theorem 101.
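The beginning of the list above can be reproduced with the test in the form in which it is now usually stated (the Lucas-Lehmer form): for an odd prime p, $M_p$ is prime if and only if $s_{p-2} \equiv 0 \pmod{M_p}$, where $s_0 = 4$ and $s_{i+1} = s_i^2 - 2$. That statement is assumed here rather than taken from this section, and the function names are ours; the sketch is meant only as an illustration.

```python
def is_prime(n):
    """Primality by trial division (adequate for the small exponents used here)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def lucas_lehmer(p):
    """True if M_p = 2^p - 1 is prime; p is assumed to be a prime exponent."""
    if p == 2:
        return True               # M_2 = 3; the recurrence below needs p odd
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# Prime exponents p < 700 for which M_p is prime.
mersenne_exponents = [p for p in range(2, 700) if is_prime(p) and lucas_lehmer(p)]
```

The same function reports $M_{67}$ and $M_{257}$ as composite, which are two of the mistakes in Mersenne's statement.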

The problem of Mersenne’s numbers is connected with that of ‘perfect’
numbers, which we shall consider in § 16.8.

We return to this subject in § 6.15 and § 15.5. 


† Euler stated in 1732 that $M_{41}$ and $M_{47}$ are prime, but this was a mistake.

2.6. Third proof of Euclid’s theorem. Suppose that 2, 3, ..., $p_j$
are the first j primes and let N(x) be the number of n not exceeding x
which are not divisible by any prime $p > p_j$. If we express such an n
in the form
    $n = n_1^2 m$,
where m is ‘quadratfrei’, i.e. is not divisible by the square of any prime,
we have
    $m = 2^{b_1} 3^{b_2} \cdots p_j^{b_j}$,
with every b either 0 or 1. There are just $2^j$ possible choices of the
exponents and so not more than $2^j$ different values of m. Again,
$n_1 \le \sqrt{n} \le \sqrt{x}$ and so there are not more than $\sqrt{x}$ different values of $n_1$.
Hence
(2.6.1)    $N(x) \le 2^j \sqrt{x}$.

If Theorem 4 is false, so that the number of primes is finite, let the
primes be 2, 3, ..., $p_j$. In this case N(x) = x for every x and so
    $x \le 2^j \sqrt{x}$,  $x \le 2^{2j}$,
which is false for $x \ge 2^{2j}+1$.
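The inequality (2.6.1) can be observed numerically. The sketch below counts N(x) by brute force for a given list of primes; the function name and the chosen values of x are ours, and the comparison is made in the squared form $N(x)^2 \le 4^j x$ so that only integers are involved.

```python
def smooth_count(x, primes):
    """N(x): how many n in 1..x have no prime factor outside `primes`."""
    count = 0
    for n in range(1, x + 1):
        m = n
        for p in primes:
            while m % p == 0:
                m //= p
        if m == 1:
            count += 1
    return count

first_primes = [2, 3, 5, 7]
# Check N(x)^2 <= 4^j * x for the first j primes, j = 1, ..., 4.
bound_holds = all(
    smooth_count(x, first_primes[:j]) ** 2 <= 4 ** j * x
    for j in range(1, 5)
    for x in (10, 100, 1000, 5000)
)
```

For example, with the primes 2 and 3 and x = 10 the numbers counted are 1, 2, 3, 4, 6, 8, 9, so that N(10) = 7.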
We can use this argument to prove two further results. 


THEOREM 19. The series
(2.6.2)    $\sum \dfrac{1}{p} = \dfrac12 + \dfrac13 + \dfrac15 + \dfrac17 + \dfrac1{11} + \cdots$
is divergent.

If the series is convergent, we can choose j so that the remainder after
j terms is less than 1/2, i.e.
    $\dfrac{1}{p_{j+1}} + \dfrac{1}{p_{j+2}} + \cdots < \dfrac12$.
The number of n ≤ x which are divisible by p is at most x/p. Hence
x − N(x), the number of n ≤ x divisible by one or more of $p_{j+1}$, $p_{j+2}$, ...,
is not more than
    $\dfrac{x}{p_{j+1}} + \dfrac{x}{p_{j+2}} + \cdots < \dfrac12 x$.
Hence, by (2.6.1),
    $\dfrac12 x < N(x) \le 2^j \sqrt{x}$,  $x < 2^{2j+2}$,
which is false for $x \ge 2^{2j+2}$. Hence the series diverges.
THEOREM 20:
    $\pi(x) \ge \dfrac{\log x}{2\log 2}$ (x ≥ 1);  $p_n \le 4^n$.

We take j = π(x), so that $p_{j+1} > x$ and N(x) = x. We have
    $x = N(x) \le 2^{\pi(x)} \sqrt{x}$,  $2^{\pi(x)} \ge \sqrt{x}$,
and the first part of Theorem 20 follows on taking logarithms. If we
put $x = p_n$, so that π(x) = n, the second part is immediate.

By Theorem 20, $\pi(10^9) \ge 15$; a number, of course, still ridiculously
below the mark.
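Both parts of Theorem 20 can be checked over a small range. In the sketch below the first part is tested in the equivalent integer form $4^{\pi(x)} \ge x$, which avoids floating-point logarithms; the sieve, the range of 5000, and all names are our own choices.

```python
from bisect import bisect_right
from math import ceil, log

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for m in range(i * i, limit + 1, i):
                sieve[m] = False
    return [i for i, flag in enumerate(sieve) if flag]

PRIMES = primes_up_to(5000)

def prime_count(x):
    """pi(x), valid for x <= 5000."""
    return bisect_right(PRIMES, x)

# First part: pi(x) >= log x / (2 log 2) is the same as 4^pi(x) >= x.
first_part = all(4 ** prime_count(x) >= x for x in range(1, 5001))
# Second part: p_n <= 4^n.
second_part = all(p <= 4 ** n for n, p in enumerate(PRIMES, start=1))
# The lower bound quoted in the text for x = 10^9.
bound_at_1e9 = ceil(log(10 ** 9) / (2 * log(2)))
```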


2.7. Further results on formulae for primes. We return for 
a moment to the questions raised in § 1.5. We may ask for ‘a formula 
for primes’ in various senses. 

(i) We may ask for a simple function f(n) which assumes all prime
values and only prime values, i.e. which takes successively the values
$p_1$, $p_2$, ... when n takes the values 1, 2, .... This is the question which we
discussed in § 1.5.

(ii) We may ask for a function which assumes prime values only.
Fermat’s conjecture, had it been right, would have supplied an answer
to this question.† As it is, no satisfactory answer is known.

(iii) We may moderate our demands and ask merely for a function
which assumes an infinity of prime values. It follows from Euclid’s
theorem that f(n) = n is such a function, and less trivial answers are
given by Theorems 11-15.

Apart from trivial solutions, Dirichlet’s Theorem 15 is the only
solution known. It has never been proved that $n^2+1$, or any other
quadratic form in n, will represent an infinity of primes, and all such
problems seem to be extremely difficult.

There are some simple negative theorems which contain a very partial 
reply to question (ii). 

THEOREM 21. No polynomial f(n) with integral coefficients, not a constant,
can be prime for all n, or for all sufficiently large n.

We may assume that the leading coefficient in f(n) is positive, so that
f(n) → ∞ when n → ∞, and f(n) > 1 for n > N, say. If x > N and
    $f(x) = a_0 x^k + \cdots = y > 1$,
then
    $f(ry+x) = a_0 (ry+x)^k + \cdots$
is divisible by y for every integral r; and f(ry+x) tends to infinity
with r. Hence there are infinitely many composite values of f(n).
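The divisibility on which this proof turns is easy to see in a particular case. The sketch below takes $f(n) = n^2 - n + 41$ and x = 5 as an arbitrary illustration (both choices are ours) and checks that f(ry + x) is a multiple of y = f(x).

```python
def poly(coeffs, n):
    """Evaluate a_0 n^k + ... + a_k by Horner's rule; coeffs = [a_0, ..., a_k]."""
    value = 0
    for a in coeffs:
        value = value * n + a
    return value

coeffs = [1, -1, 41]              # f(n) = n^2 - n + 41
x = 5
y = poly(coeffs, x)               # f(5) = 61
# f(r*y + x) is divisible by y for every integral r (checked here for r = 1..19).
divisible = all(poly(coeffs, r * y + x) % y == 0 for r in range(1, 20))
first_composite = poly(coeffs, y + x)   # f(66) = 4331 = 61 * 71
```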

There are quadratic forms which assume prime values for considerable
sequences of values of n. Thus $n^2-n+41$ is prime for 0 ≤ n ≤ 40,
and
    $n^2-79n+1601 = (n-40)^2+(n-40)+41$
for 0 ≤ n ≤ 79.
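These two runs of prime values, and the points at which they end, can be confirmed by trial division; the helper below is a plain illustrative check and its name is ours.

```python
def is_prime(n):
    """Primality by trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

run_41 = all(is_prime(n * n - n + 41) for n in range(0, 41))           # 0 <= n <= 40
run_1601 = all(is_prime(n * n - 79 * n + 1601) for n in range(0, 80))  # 0 <= n <= 79
# Both runs stop at the next value of n, where the form takes the value 41^2.
next_41 = 41 * 41 - 41 + 41
next_1601 = 80 * 80 - 79 * 80 + 1601
```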


A more general theorem, which we shall prove in § 6.4, is

THEOREM 22. If
    $f(n) = P(n, 2^n, 3^n, \ldots, k^n)$
is a polynomial in its arguments, with integral coefficients, and f(n) → ∞
when n → ∞,‡ then f(n) is composite for an infinity of values of n.

† It had been suggested that Fermat’s sequence should be replaced by
    $2+1$, $2^2+1$, $2^{2^2}+1$, $2^{2^{2^2}}+1$, ....
The first four numbers are prime, but $F_{16}$, the fifth member of this sequence, is now
known to be composite. Another suggestion was that the sequence $M_p$, where p is
confined to the Mersenne primes, would contain only primes. The first five Mersenne
primes are
    $M_2 = 3$, $M_3 = 7$, $M_5 = 31$, $M_7 = 127$, $M_{13} = 8191$
and the sequence proposed would be
    $M_3$, $M_7$, $M_{31}$, $M_{127}$, $M_{8191}$.
The first four are prime but $M_{8191}$ is composite.

‡ Some care is required in the statement of the theorem, to avoid such an f(n) as
$2^n 3^n - 6^n + 5$, which is plainly prime for all n.

2.8. Unsolved problems concerning primes. In § 1.4 we stated
two conjectural theorems of which no proof is known, although empirical 
evidence makes their truth seem highly probable. There are many other 
conjectural theorems of the same kind. 

There are infinitely many primes $n^2+1$. More generally, if a, b, c are
integers without a common divisor, a is positive, a+b and c are not both
even, and $b^2-4ac$ is not a perfect square, then there are infinitely many
primes $an^2+bn+c$.

We have already referred to the form $n^2+1$ in § 2.7 (iii). If a, b, c
have a common divisor, there can obviously be at most one prime of
the form required. If a+b and c are both even, then $N = an^2+bn+c$
is always even. If $b^2-4ac = k^2$, then
    $4aN = (2an+b)^2 - k^2$.
Hence, if N is prime, either 2an+b+k or 2an+b−k divides 4a, and this
can be true for at most a finite number of values of n. The limitations
stated in the conjecture are therefore essential.


There is always a prime between $n^2$ and $(n+1)^2$.

If n > 4 is even, then n is the sum of two odd primes. 

This is ‘Goldbach’s theorem’. 

If n ≥ 9 is odd, then n is the sum of three odd primes.

Any n from some point onwards is a square or the sum of a prime and 
a square. 

This is not true of all n; thus 34 and 58 are exceptions. 
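That 34 and 58 are exceptions can be verified by exhaustion. In the sketch below the square $m^2$ is allowed to be 0; this convention is our assumption, and it does not affect the result for 34 or 58, neither of which is prime.

```python
from math import isqrt

def is_prime(n):
    """Primality by trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def square_or_prime_plus_square(n):
    """True if n = m^2, or n = p + m^2 with p prime, for some integer m >= 0."""
    return any(m * m == n or is_prime(n - m * m) for m in range(isqrt(n) + 1))

exceptions_confirmed = (not square_or_prime_plus_square(34)
                        and not square_or_prime_plus_square(58))
```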

A more dubious conjecture, to which we referred in § 2.5, is 


The number of Fermat primes $F_n$ is finite.


2.9. Moduli of integers. We now give the proof of Theorems 3 
and 2 which we postponed from § 1.3. Another proof will be given in 
§ 2.11 and a third in § 12.4. Throughout this section integer means 
rational integer, positive or negative. 

The proof depends upon the notion of a ‘modulus’ of numbers. A
modulus is a system S of numbers such that the sum and difference of
any two members of S are themselves members of S: i.e.
(2.9.1)    $m \in S \,.\, n \in S \to (m \pm n) \in S$.

The numbers of a modulus need not necessarily be integers or even
rational; they may be complex numbers, or quaternions: but here we
are concerned only with moduli of integers.

The single number 0 forms a modulus (the null modulus).
It follows from the definition of S that
    $a \in S \to 0 = a-a \in S \,.\, 2a = a+a \in S$.


Repeating the argument, we see that na ∈ S for any integral n (positive
or negative). More generally
(2.9.2)    $a \in S \,.\, b \in S \to xa+yb \in S$
for any integral x, y. On the other hand, it is obvious that, if a and b
are given, the aggregate of values of xa+yb forms a modulus.

It is plain that any modulus S, except the null modulus, contains
some positive numbers. Suppose that d is the smallest positive number
of S. If n is any positive number of S, then n − zd ∈ S for all z. If c is
the remainder when n is divided by d and
    n = zd + c,
then c ∈ S and 0 ≤ c < d. Since d is the smallest positive number of
S, c = 0 and n = zd. Hence


THEOREM 23. Any modulus, other than the null modulus, is the aggregate
of integral multiples of a positive number d.

We define the highest common divisor d of two integers a and b, not
both zero, as the largest positive integer which divides both a and b;
and write
    d = (a, b).
Thus (0, a) = |a|. We may define the highest common divisor
    (a, b, c, ..., k)
of any set of positive integers a, b, c, ..., k in the same way.
The aggregate of numbers of the form
    xa + yb,
for integral x, y, is a modulus which, by Theorem 23, is the aggregate
of multiples zc of a certain positive c. Since c divides every number of
S, it divides a and b, and therefore
    c ≤ d.
On the other hand,
    $d \mid a \,.\, d \mid b \to d \mid xa+yb$,
so that d divides every number of S, and in particular c. It follows that
    c = d
and that S is the aggregate of multiples of d.

THEOREM 24. The modulus xa+yb is the aggregate of multiples of
d = (a, b).


It is plain that we have proved incidentally 


THEOREM 25. The equation
    ax + by = n
is soluble in integers x, y if and only if d | n. In particular,
    ax + by = d
is soluble.

THEOREM 26. Any common divisor of a and b divides d.
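Theorem 25 asserts that a solution exists but the argument above does not exhibit one. A solution can be computed by the extended form of Euclid's algorithm, which is a different technique from the proof by moduli given here; the sketch below is one such computation, with function names of our own choosing.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = (a, b) > 0 and a*x + b*y = d; a, b not both zero."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    # Invariant: old_r = a*old_x + b*old_y and r = a*x + b*y.
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:                 # normalise so that d is positive
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

def solve_linear(a, b, n):
    """One integral solution (x, y) of a*x + b*y = n, or None if d does not divide n."""
    d, x, y = extended_gcd(a, b)
    if n % d != 0:
        return None
    return x * (n // d), y * (n // d)
```

For instance `solve_linear(6, 4, 5)` returns `None`, since (6, 4) = 2 does not divide 5.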


2.10. Proof of the fundamental theorem of arithmetic. We
are now in a position to prove Euclid’s theorem 3, and so Theorem 2.

Suppose that p is prime and p | ab. If p ∤ a then (a, p) = 1, and therefore,
by Theorem 24, there are an x and a y for which xa + yp = 1 or
    xab + ypb = b.
But p | ab and p | pb, and therefore p | b.

Practically the same argument proves

THEOREM 27: (a, b) = d . c > 0 → (ac, bc) = dc.

For there are an x and a y for which xa + yb = d or
    xac + ybc = dc.
Hence (ac, bc) | dc. On the other hand, d | a → dc | ac and d | b → dc | bc;
and therefore, by Theorem 26, dc | (ac, bc). Hence (ac, bc) = dc.


2.11. Another proof of the fundamental theorem. We call 
numbers which can be factorized into primes in more than one way 
abnormal. Let n be the least abnormal number. The same prime P 
cannot appear in two different factorizations of n, for, if it did, n/P 
would be abnormal and n/P < n. We have then 


    $n = p_1 p_2 p_3 \cdots = q_1 q_2 \cdots$,
where the p and q are primes, no p is a q and no q is a p.

We may take $p_1$ to be the least p; since n is composite, $p_1^2 \le n$.
Similarly, if $q_1$ is the least q, we have $q_1^2 \le n$ and, since $p_1 \ne q_1$, it
follows that $p_1 q_1 < n$. Hence, if $N = n - p_1 q_1$, we have 0 < N < n
and N is not abnormal. Now $p_1 \mid n$ and so $p_1 \mid N$; similarly $q_1 \mid N$.
Hence $p_1$ and $q_1$ both appear in the unique factorization of N and
$p_1 q_1 \mid N$. From this it follows that $p_1 q_1 \mid n$ and hence that $q_1 \mid n/p_1$.
But $n/p_1$ is less than n and so has the unique prime factorization $p_2 p_3 \cdots$.
Since $q_1$ is not a p, this is impossible. Hence there cannot be any abnormal
numbers and this is the fundamental theorem.




NOTES ON CHAPTER II 


§ 2.2. Mr. Ingham tells us that the argument used here is due to Bohr and 
Littlewood: see Ingham, 2. 

§ 2.3. For Theorems 11, 12, and 14, see Lucas, Théorie des nombres, i (1891), 
353-4; and for Theorem 15 see Landau, Handbuch, 422-46, and Vorlesungen, i. 
79-96. 

§ 2.4. See Pólya and Szegő, ii. 133, 342. 

§ 2.5. See Dickson, History, i, chs. i, xv, xvi, Rouse Ball (Coxeter), 65-69,
and, for numerical results, Kraitchik, Théorie des nombres, i (Paris, 1922), 22,
218, D. H. Lehmer, Bulletin Amer. Math. Soc. 38 (1932), 383-4 and, for the
large primes and factors of Fermat numbers recently obtained by modern
high-speed computing, Miller and Wheeler, Nature, 168 (1951), 838, Robinson, Proc.
Amer. Math. Soc. 5 (1954), 842-6, and Math. tables, 11 (1957), 21-22, Riesel, Math.
tables, 12 (1958), 60, Hurwitz and Selfridge, Amer. Math. Soc. Notices, 8 (1961), 601.
See D. H. Gillies [Math. Computation 18 (1964), 93-5] for the three largest Mersenne
primes and for references.

Ferrier’s prime is $(2^{148}+1)/17$ and is the largest prime found without the use
of electronic computing (and may well remain so).

Much information about large numbers known to be prime is to be found in
Sphinx (Brussels, 1931-9). A list in vol. 6 (1936), 166, gives all those (336 in
number) between $10^{12}-10^4$ and $10^{12}$, and one in vol. 8 (1938), 86, those between
$10^{12}$ and $10^{12}+10^4$. In addition to this, Kraitchik, in vol. 3 (1933), 99-101, gives
a list of 161 primes ranging from 1,018,412,127,823 to $2^{127}-1$, mostly factors of
numbers $2^k \pm 1$. This list supersedes an earlier list in Mathematica (Cluj), 7 (1933),
93-94; and Kraitchik himself and other writers add substantially to it in later
numbers. See also Rouse Ball (Coxeter), 62-65.

Our proof that 641 | $F_5$ is taken from Kraitchik, Théorie des nombres, ii (Paris,
1926), 221. 


§ 2.6. See Erdős, Mathematica, B, 7 (1938), 1-2. Theorem 19 was proved by
Euler in 1737. 

§ 2.7. Theorem 21 is due to Goldbach (1752) and Theorem 22 to Morgan Ward, 
Journal London Math. Soc. 5 (1930), 106-7. 

§ 2.8. ‘Goldbach’s theorem’ was enunciated by Goldbach in a letter to Euler in
1742. It is still unproved, but Vinogradov proved in 1937 that all odd numbers from
a certain point onwards are sums of three odd primes. van der Corput and Estermann
used his method to prove that ‘almost all’ even numbers are sums of two
primes. See Estermann, Introduction, for Vinogradov’s proof, and James, Bulletin
Amer. Math. Soc. 55 (1949), 246-60, for an account of recent work in this field.

Mr. A. K. Austin and Professor P. T. Bateman each drew my attention to the 
falsehood of one of the conjectures in this section in the third edition. 

§§ 2.9-10. The argument follows the lines of Hecke, ch. i. The definition of
a modulus is the natural one, but is redundant. It is sufficient to assume that
    $m \in S \,.\, n \in S \to m-n \in S$.
For then
    $0 = n-n \in S$,  $-n = 0-n \in S$,  $m+n = m-(-n) \in S$.


§ 2.11. F. A. Lindemann, Quart. J. of Math. (Oxford), 4 (1933), 319-20, and
Davenport, Higher arithmetic, 20. For somewhat similar proofs, see Zermelo,
Göttinger Nachrichten (new series), i (1934), 43-44, and Hasse, Journal für Math.
159 (1928), 3-6.


III 
FAREY SERIES AND A THEOREM OF MINKOWSKI 


3.1. The definition and simplest properties of a Farey series. 
In this chapter we shall be concerned primarily with certain properties 
of the ‘positive rationals’ or ‘vulgar fractions’, such as 4 or-34. Such 
a fraction may be regarded as a relation between two positive integers, 
and the theorems which we prove embody properties of the positive 
integers. 

The Farey series $\mathfrak{F}_n$ of order n is the ascending series of irreducible
fractions between 0 and 1 whose denominators do not exceed n. Thus
h/k belongs to $\mathfrak{F}_n$ if
(3.1.1)    0 ≤ h ≤ k ≤ n,  (h, k) = 1;
the numbers 0 and 1 are included in the forms 0/1 and 1/1. For example,
$\mathfrak{F}_5$ is
    0/1, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 1/1.
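The definition translates directly into a short program. The sketch below builds the series by brute force from (3.1.1), relying on Python's `Fraction` type to reduce each h/k to its lowest terms; the function name `farey` is ours.

```python
from fractions import Fraction

def farey(n):
    """Farey series of order n as an ascending list of Fractions from 0/1 to 1/1."""
    # Fraction(h, k) is stored in lowest terms, so the set removes reducible duplicates.
    return sorted({Fraction(h, k) for k in range(1, n + 1) for h in range(k + 1)})

series_5 = [str(f) for f in farey(5)]
```

Here `farey(5)` has the eleven terms listed above, the first and last being printed by Python as `0` and `1`.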


The characteristic properties of Farey series are expressed by the 
following theorems. 

THEOREM 28. If h/k and h'/k' are two successive terms of $\mathfrak{F}_n$, then
(3.1.2)    kh' − hk' = 1.

THEOREM 29. If h/k, h''/k'', and h'/k' are three successive terms of $\mathfrak{F}_n$,
then
(3.1.3)    $\dfrac{h''}{k''} = \dfrac{h+h'}{k+k'}$.

We shall prove that the two theorems are equivalent in the next
section, and then give three different proofs of both of them, in §§ 3.3,
3.4, and 3.7 respectively. We conclude this section by proving two still
simpler properties of $\mathfrak{F}_n$.

THEOREM 30. If h/k and h'/k' are two successive terms of $\mathfrak{F}_n$, then
(3.1.4)    k + k' > n.

The ‘mediant’
    $\dfrac{h+h'}{k+k'}$†
of h/k and h'/k' falls in the interval
    $\left(\dfrac{h}{k}, \dfrac{h'}{k'}\right)$.
Hence, unless (3.1.4) is true, there is another term of $\mathfrak{F}_n$ between h/k
and h'/k'.

THEOREM 31. If n > 1, then no two successive terms of $\mathfrak{F}_n$ have the
same denominator.

If k > 1 and h'/k succeeds h/k in $\mathfrak{F}_n$, then h+1 ≤ h' < k. But then
    $\dfrac{h}{k} < \dfrac{h}{k-1} < \dfrac{h+1}{k} \le \dfrac{h'}{k}$,
and h/(k−1)† comes between h/k and h'/k in $\mathfrak{F}_n$, a contradiction.

† Or the reduced form of this fraction.
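Theorems 28 to 31 can all be tested on the series of small order. The sketch below regenerates the series by brute force and checks each property for successive terms; it is a finite check only, and the function names are ours.

```python
from fractions import Fraction

def farey(n):
    """Farey series of order n as an ascending list of Fractions."""
    return sorted({Fraction(h, k) for k in range(1, n + 1) for h in range(k + 1)})

def check_farey_properties(n):
    """Return the truth of Theorems 28, 29, 30, 31 for the series of order n."""
    f = farey(n)
    pairs = list(zip(f, f[1:]))
    t28 = all(a.denominator * b.numerator - a.numerator * b.denominator == 1
              for a, b in pairs)
    t29 = all(m == Fraction(a.numerator + b.numerator, a.denominator + b.denominator)
              for a, m, b in zip(f, f[1:], f[2:]))
    t30 = all(a.denominator + b.denominator > n for a, b in pairs)
    t31 = n == 1 or all(a.denominator != b.denominator for a, b in pairs)
    return t28, t29, t30, t31

results = {n: check_farey_properties(n) for n in range(1, 13)}
```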


3.2. The equivalence of the two characteristic properties. 
We now prove that each of Theorems 28 and 29 implies the other. 

(1) Theorem 28 implies Theorem 29. If we assume Theorem 28, and
solve the equations
(3.2.1)    kh'' − hk'' = 1,  k''h' − h''k' = 1
for h'' and k'', we obtain
    h''(kh' − hk') = h + h',  k''(kh' − hk') = k + k'
and so (3.1.3).

(2) Theorem 29 implies Theorem 28. We assume that Theorem 29 is
true generally and that Theorem 28 is true for $\mathfrak{F}_{n-1}$, and deduce that
Theorem 28 is true for $\mathfrak{F}_n$. It is plainly sufficient to prove that the
equations (3.2.1) are satisfied when h''/k'' belongs to $\mathfrak{F}_n$ but not to
$\mathfrak{F}_{n-1}$, so that k'' = n. In this case, after Theorem 31, both k and k'
are less than k'', and h/k and h'/k' are consecutive terms in $\mathfrak{F}_{n-1}$.

Since (3.1.3) is true ex hypothesi, and h''/k'' is irreducible, we have
    h + h' = λh'',  k + k' = λk'',
where λ is an integer. Since k and k' are both less than k'', λ must be 1.
Hence
    h'' = h + h',  k'' = k + k',
    kh'' − hk'' = kh' − hk' = 1;
and similarly
    k''h' − h''k' = 1.


3.3. First proof of Theorems 28 and 29. Our first proof is a 
natural development of the ideas used in § 3.2. 

The theorems are true for n = 1; we assume them true for $\mathfrak{F}_{n-1}$ and
prove them true for $\mathfrak{F}_n$.

Suppose that h/k and h'/k' are consecutive in $\mathfrak{F}_{n-1}$ but separated by
h''/k'' in $\mathfrak{F}_n$.‡ Let
(3.3.1)    kh'' − hk'' = r > 0,  k''h' − h''k' = s > 0.
Solving these equations for h'' and k'', and remembering that
    kh' − hk' = 1,
we obtain
(3.3.2)    h'' = sh + rh',  k'' = sk + rk'.
Here (r, s) = 1, since (h'', k'') = 1.

Consider now the set S of all fractions
(3.3.3)    $\dfrac{H}{K} = \dfrac{\mu h + \lambda h'}{\mu k + \lambda k'}$
in which λ and μ are positive integers and (λ, μ) = 1. Thus h''/k''
belongs to S. Every fraction of S lies between h/k and h'/k', and is in
its lowest terms, since any common divisor of H and K would divide
    k(μh + λh') − h(μk + λk') = λ
and
    h'(μk + λk') − k'(μh + λh') = μ.
Hence every fraction of S appears sooner or later in some $\mathfrak{F}_q$; and plainly
the first to make its appearance is that for which K is least, i.e. that
for which λ = 1 and μ = 1. This fraction must be h''/k'', and so
(3.3.4)    h'' = h + h',  k'' = k + k'.

This proves Theorem 29. It is to be observed that the equations
(3.3.4) are not generally true for three successive fractions of $\mathfrak{F}_n$, but
are (as we have shown) true when the central fraction has made its
first appearance in $\mathfrak{F}_n$.

‡ After Theorem 31, h''/k'' is the only term of $\mathfrak{F}_n$ between h/k and h'/k'; but we do
not assume this in the proof.


3.4. Second proof of the theorems. This proof is not inductive,
and gives a rule for the construction of the term which succeeds h/k
in $\mathfrak{F}_n$.

Since (h, k) = 1, the equation
(3.4.1)    kx − hy = 1
is soluble in integers (Theorem 25). If $x_0$, $y_0$ is a solution then
    $x_0 + rh$,  $y_0 + rk$
is also a solution for any positive or negative integral r. We can choose
r so that
    $n-k < y_0 + rk \le n$.
There is therefore a solution (x, y) of (3.4.1) such that
(3.4.2)    (x, y) = 1,  0 ≤ n − k < y ≤ n.
Since x/y is in its lowest terms, and y ≤ n, x/y is a fraction of $\mathfrak{F}_n$.
Also
    $\dfrac{x}{y} = \dfrac{h}{k} + \dfrac{1}{ky} > \dfrac{h}{k}$,
so that x/y comes later in $\mathfrak{F}_n$ than h/k. If it is not h'/k', it comes later
than h'/k', and
    $\dfrac{x}{y} - \dfrac{h'}{k'} = \dfrac{k'x - h'y}{k'y} \ge \dfrac{1}{k'y}$;
while
    $\dfrac{h'}{k'} - \dfrac{h}{k} = \dfrac{kh' - hk'}{kk'} \ge \dfrac{1}{kk'}$.
Hence
    $\dfrac{1}{ky} = \dfrac{kx - hy}{ky} = \dfrac{x}{y} - \dfrac{h}{k} \ge \dfrac{1}{k'y} + \dfrac{1}{kk'} = \dfrac{k+y}{kk'y} > \dfrac{n}{kk'y} \ge \dfrac{1}{ky}$,
by (3.4.2). This is a contradiction, and therefore x/y must be h'/k', and
    kh' − hk' = 1.

Thus, to find the successor of 4/9 in $\mathfrak{F}_{13}$, we begin by finding some solution ($x_0$, $y_0$)
of 9x − 4y = 1, e.g. $x_0$ = 1, $y_0$ = 2. We then choose r so that 2 + 9r lies between
13 − 9 = 4 and 13. This gives r = 1, x = 1 + 4r = 5, y = 2 + 9r = 11, and the
fraction required is 5/11.
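The rule just described can be written out as a short function. In the sketch below the initial solution is found by simply trying $y_0$ = 0, 1, ..., k−1, which is our own choice of method; the function name is also ours, and h/k is assumed to be in lowest terms, less than 1, and to have k ≤ n.

```python
def farey_successor(h, k, n):
    """Numerator and denominator of the term following h/k in the series of order n."""
    # Step 1: some solution (x0, y0) of k*x - h*y = 1, trying y0 = 0, 1, ..., k-1.
    for y0 in range(k):
        if (1 + h * y0) % k == 0:
            x0 = (1 + h * y0) // k
            break
    else:
        raise ValueError("h/k is not in lowest terms")
    # Step 2: choose r so that n - k < y0 + r*k <= n.
    r = (n - y0) // k
    return x0 + r * h, y0 + r * k

successor = farey_successor(4, 9, 13)   # the worked example: 4/9 in the series of order 13
```

With h = 4, k = 9, n = 13 the search finds $x_0$ = 1, $y_0$ = 2, then r = 1, and the function returns (5, 11), in agreement with the example.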


3.5. The integral lattice. Our third and last proof depends on 
simple but important geometrical ideas. 

Suppose that we are given an origin O in the plane and two points
P, Q not collinear with O. We complete the parallelogram OPQR,
produce its sides indefinitely, and draw the two systems of equidistant
parallels of which OP, QR and OQ, PR are consecutive pairs, thus
dividing the plane into an infinity of equal parallelograms. Such a
figure is called a lattice (Gitter).

A lattice is a figure of lines. It defines a figure of points, viz. the
system of points of intersection of the lines, or lattice points. Such
a system we call a point-lattice.

Two different lattices may determine the same point-lattice; thus 
in Fig. 1 the lattices based on OP, OQ and on OP, OR determine the 
same system of points. Two lattices which determine the same point- 
lattice are said to be equivalent. 

It is plain that any lattice point of a lattice might be regarded as the
origin O, and that the properties of the lattice are independent of the
choice of origin and symmetrical about any origin.

One type of lattice is particularly important here. This is the lattice
which is formed (when the rectangular coordinate axes are given) by
parallels to the axes at unit distances, dividing the plane into unit
squares. We call this the fundamental lattice L, and the point-lattice
which it determines, viz. the system of points (x, y) with integral
coordinates, the fundamental point-lattice Λ.

Any point-lattice may be regarded as a system of numbers or vectors,
the complex coordinates x+iy of the lattice points or the vectors to these
points from the origin. Such a system is plainly a modulus in the sense of
§ 2.9. If P and Q are the points $(x_1, y_1)$ and $(x_2, y_2)$, then the coordinates of
any point S of the lattice based upon OP and OQ are
    $x = mx_1 + nx_2$,  $y = my_1 + ny_2$,
where m and n are integers; or if $z_1$ and $z_2$ are the complex coordinates of P
and Q, then the complex coordinate of S is
    $z = mz_1 + nz_2$.
[Fig. 1]

3.6. Some simple properties of the fundamental lattice. (1) We
now consider the transformation defined by
(3.6.1)    x' = ax + by,  y' = cx + dy,
where a, b, c, d are given, positive or negative, integers. It is plain
that any point (x, y) of Λ is transformed into another point (x', y')
of Λ.

Solving (3.6.1) for x and y, we obtain
(3.6.2)    $x = \dfrac{dx' - by'}{ad - bc}$,  $y = -\dfrac{cx' - ay'}{ad - bc}$.
If
(3.6.3)    Δ = ad − bc = ±1,
then any integral values of x' and y' give integral values of x and y,
and every lattice point (x', y') corresponds to a lattice point (x, y). In
this case Λ is transformed into itself.

Conversely, if Λ is transformed into itself, every integral (x', y') must
give an integral (x, y). Taking in particular (x', y') to be (1, 0) and (0, 1),
we see that
    Δ | d,  Δ | b,  Δ | c,  Δ | a,
and so
    Δ² | ad − bc,  Δ² | Δ.
Hence Δ = ±1.

We have thus proved

THEOREM 32. A necessary and sufficient condition that the transformation
(3.6.1) should transform Λ into itself is that Δ = ±1.

We call such a transformation unimodular.
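The sufficiency half of Theorem 32 can be seen in action by applying (3.6.1) and then inverting it with (3.6.2). The sketch below does this for integral points; the function names and the particular coefficients are ours, and the inverse is refused when Δ is not ±1, since the quotients in (3.6.2) need not then be integers.

```python
def transform(a, b, c, d, point):
    """Apply (3.6.1): (x, y) -> (ax + by, cx + dy)."""
    x, y = point
    return (a * x + b * y, c * x + d * y)

def inverse_transform(a, b, c, d, point):
    """Apply (3.6.2) to an integral point; only allowed when ad - bc = +1 or -1."""
    xp, yp = point
    delta = a * d - b * c
    if delta not in (1, -1):
        raise ValueError("transformation is not unimodular")
    # Division by +1 or -1 is exact, so integer division loses nothing.
    return ((d * xp - b * yp) // delta, (a * yp - c * xp) // delta)

grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
round_trip = all(
    inverse_transform(a, b, c, d, transform(a, b, c, d, p)) == p
    for (a, b, c, d) in [(2, 1, 1, 1), (1, 2, 3, 5)]    # determinants +1 and -1
    for p in grid
)
```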


[Figs. 2a, 2b, 2c]


(2) Suppose now that P and Q are the lattice points (a, c) and (b, d)
of Λ. The area of the parallelogram defined by OP and OQ is
    δ = ±(ad − bc) = |ad − bc|,
the sign being chosen to make δ positive. The points (x', y') of the
lattice Λ' based on OP and OQ are given by
    x' = xa + yb,  y' = xc + yd,
where x and y are arbitrary integers. After Theorem 32, a necessary
and sufficient condition that Λ' should be identical with Λ is that
δ = 1.

THEOREM 33. A necessary and sufficient condition that the lattice L'
based upon OP and OQ should be equivalent to L is that the area of the
parallelogram defined by OP and OQ should be unity.

(3) We call a point P of Λ visible (i.e. visible from the origin) if there
is no point of Λ on OP between O and P. In order that (x, y) should
be visible, it is necessary and sufficient that x/y should be in its lowest
terms, or (x, y) = 1.

THEOREM 34. Suppose that P and Q are visible points of Λ, and that
δ is the area of the parallelogram J defined by OP and OQ. Then

(i) if δ = 1, there is no point of Λ inside J;

(ii) if δ > 1, there is at least one point of Λ inside J, and, unless that
point is the intersection of the diagonals of J, at least two, one in each of
the triangles into which J is divided by PQ.

There is no point of Λ inside J if and only if the lattice L' based on
OP and OQ is equivalent to L, i.e. if and only if δ = 1. If δ > 1, there
is at least one such point S. If R is the fourth vertex of the parallelogram
J, and RT is parallel and equal to OS, but with the opposite sense,
then (since the properties of a lattice are symmetrical, and independent
of the particular lattice point chosen as origin) T is also a point of Λ,
and there are at least two points of Λ inside J unless T coincides with
S. This is the special case mentioned under (ii).

The different cases are illustrated in Figs. 2 a, 2 b, 2 c.


3.7. Third proof of Theorems 28 and 29. The fractions h/k with
    0 ≤ h ≤ k ≤ n,  (h, k) = 1
are the fractions of $\mathfrak{F}_n$, and correspond to the visible points (k, h) of Λ
inside, or on the boundary of, the triangle defined by the lines y = 0,
y = x, x = n.

If we draw a ray through O and rotate it round the origin in the
counter-clockwise direction from an initial position along the axis of x,
it will pass in turn through each point (k, h) representative of a Farey
fraction. If P and P' are points (k, h) and (k', h') representing consecutive
fractions, there is no representative point inside the triangle
OPP' or on the join PP', and therefore, by Theorem 34,
    kh' − hk' = 1.

3.8. The Farey dissection of the continuum. It is often convenient
to represent the real numbers on a circle instead of, as usual,
on a straight line, the object of the circular representation being to
eliminate integral parts. We take a circle C of unit circumference, and
an arbitrary point O of the circumference as the representative of 0,
and represent x by the point $P_x$ whose distance from O, measured round
the circumference in the counter-clockwise direction, is x. Plainly all
integers are represented by the same point O, and numbers which differ
by an integer have the same representative point.

It is sometimes useful to divide up the circumference of C in the
following manner. We take the Farey series $\mathfrak{F}_n$, and form all the
mediants
    $\mu = \dfrac{h+h'}{k+k'}$
of successive pairs h/k, h'/k'. The first and last mediants are
    $\dfrac{0+1}{1+n} = \dfrac{1}{n+1}$,  $\dfrac{n-1+1}{n+1} = \dfrac{n}{n+1}$.
The mediants naturally do not belong themselves to $\mathfrak{F}_n$.

We now represent each mediant μ by the point $P_\mu$. The circle is thus
divided up into arcs which we call Farey arcs, each bounded by two
points $P_\mu$ and containing one Farey point, the representative of a term
of $\mathfrak{F}_n$. Thus
    $\left(\dfrac{n}{n+1}, \dfrac{1}{n+1}\right)$
is a Farey arc containing the one Farey point O. The aggregate of
Farey arcs we call the Farey dissection of the circle.

In what follows we suppose that n > 1. If $P_{h/k}$ is a Farey point, and
$h_1/k_1$, $h_2/k_2$ are the terms of $\mathfrak{F}_n$ which precede and follow h/k, then the
Farey arc round $P_{h/k}$ is composed of two parts, whose lengths are
    $\dfrac{h}{k} - \dfrac{h+h_1}{k+k_1} = \dfrac{1}{k(k+k_1)}$,  $\dfrac{h+h_2}{k+k_2} - \dfrac{h}{k} = \dfrac{1}{k(k+k_2)}$
respectively. Now $k+k_1 < 2n$, since k and $k_1$ are unequal (Theorem 31)
and neither exceeds n; and $k+k_1 > n$, by Theorem 30. We thus obtain

THEOREM 35. In the Farey dissection of order n, where n > 1, each
part of the arc which contains the representative of h/k has a length between
    $\dfrac{1}{k(2n-1)}$,  $\dfrac{1}{k(n+1)}$.

The dissection, in fact, has a certain ‘uniformity’ which explains its
importance.

We use the Farey dissection here to prove a simple theorem concern- 
ing the approximation of arbitrary real numbers by rationals, a topic 
to which we shall return in Ch. XI. 

THEOREM 36. If $\xi$ is any real number, and n a positive integer, then 
there is an irreducible fraction h/k such that 

(3.8.1)    $0 < k \le n, \qquad \left|\xi - \frac{h}{k}\right| \le \frac{1}{k(n+1)}.$

We may suppose that $0 < \xi < 1$. Then $\xi$ falls in an interval bounded 
by two successive fractions of $\mathfrak{F}_n$, say h/k and h'/k', and therefore in 
one of the intervals 

$$\left(\frac{h}{k}, \frac{h+h'}{k+k'}\right), \qquad \left(\frac{h+h'}{k+k'}, \frac{h'}{k'}\right).$$

Hence, after Theorem 35, either h/k or h'/k' satisfies the conditions: 
h/k if $\xi$ falls in the first interval, h'/k' if it falls in the second. 
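
The proof is constructive, and may be followed step by step in code. The sketch below is an illustration under our own assumptions (the name `approximate` is hypothetical, and the number to be approximated is supplied as an exact fraction): it reduces to the interval from 0 to 1, finds the two neighbouring Farey fractions, and chooses between them by comparison with their mediant.

```python
from fractions import Fraction
from math import floor, gcd

def approximate(xi, n):
    """Return an irreducible h/k with 0 < k <= n and |xi - h/k| <= 1/(k(n+1))."""
    xi = Fraction(xi)
    whole = floor(xi)
    frac = xi - whole                      # now 0 <= frac < 1
    series = sorted({Fraction(h, k) for k in range(1, n + 1)
                     for h in range(k + 1) if gcd(h, k) == 1})
    for lo, hi in zip(series, series[1:]):
        if lo <= frac <= hi:
            mediant = Fraction(lo.numerator + hi.numerator,
                               lo.denominator + hi.denominator)
            best = lo if frac <= mediant else hi
            return best + whole            # adding an integer keeps the denominator

xi = Fraction(314159, 100000)
r = approximate(xi, 10)
print(r, abs(xi - r) <= Fraction(1, r.denominator * 11))
```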


3.9. A theorem of Minkowski. If P and Q are points of $\Lambda$, P' 
and Q' the points symmetrical to P and Q about the origin, and we add 
to the parallelogram J of Theorem 34 the three parallelograms based 
on OQ, OP', on OP', OQ', and on OQ', OP, we obtain a parallelogram 
K whose centre is the origin and whose area $4\delta$ is four times that of J. 
If $\delta$ has the value 1 (its least possible value) there are points of $\Lambda$ on 
the boundary of K, but none, except O, inside. If $\delta > 1$, then there are 
points of $\Lambda$, other than O, inside K. This is a very special case of a 
famous theorem of Minkowski, which asserts that the same property is 
possessed, not only by any parallelogram symmetrical about the origin 
(whether generated by points of $\Lambda$ or not), but by any ‘convex region’ 
symmetrical about the origin. 

An open region R is a set of points with the properties (1) if P belongs 
to R, then all points of the plane sufficiently near to P belong to R, 
(2) any two points of R can be joined by a continuous curve lying 
entirely in R. We may also express (1) by saying that any point of R 
is an interior point of R. Thus the inside of a circle or a parallelogram 
is an open region. The boundary C of R is the set of points which are 
limit points of R but do not themselves belong to R. Thus the boundary 
of a circle is its circumference. A closed region R* is an open region R 
together with its boundary. We consider only bounded regions. 

There are two natural definitions of a convex region, which may be 
shown to be equivalent. First, we may say that R (or R*) is convex 
if every point of any chord of R, i.e. of any line joining two points of 
R, belongs to R. Secondly, we may say that R (or R*) is convex if it 
is possible, through every point P of C, to draw at least one line l such 
that the whole of R lies on one side of l. Thus a circle and a 
parallelogram are convex; for the circle, l is the tangent at P, while for the 
parallelogram every line l is a side except at the vertices, where there 
are an infinity of lines with the property required. 

It is easy to prove the equivalence of the two definitions. Suppose first that 
R is convex according to the second definition, that P and Q belong to R, and 
that a point S of PQ does not. Then there is a point T of C (which may be S 
itself) on PS, and a line l through T which leaves R entirely on one side; and, 
since all points sufficiently near to P or Q belong to R, this is a contradiction. 

Secondly, suppose that R is convex according to the first definition and that 
P is a point of C; and consider the set L of lines joining P to points of R. If $Y_1$ 
and $Y_2$ are points of R, and Y is a point of $Y_1Y_2$, then Y is a point of R and PY 
a line of L. Hence there is an angle APB such that every line from P within 
APB, and no line outside APB, belongs to L. If APB > $\pi$, then there are 
points D, E of R such that DE passes through P, in which case P belongs to 
R and not to C, a contradiction. Hence APB $\le \pi$. If APB = $\pi$, then AB is 
a line l; if APB < $\pi$, then any line through P, outside the angle, is a line l. 


It is plain that convexity is invariant for translations and for 
magnifications about a point O. 

A convex region R has an area (definable, for example, as the upper 
bound of the areas of networks of small squares whose vertices lie in R). 


THEOREM 37 (MINKOWSKI’S THEOREM). Any convex region R 
symmetrical about O, and of area greater than 4, includes points of $\Lambda$ other 
than O. 
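
A small numerical experiment may make the statement concrete. The sketch below is an illustration only; the region chosen, the parallelogram $|x - \theta y| < c$, $|y| < d$ of area 4cd, and the function name `lattice_points_in_strip` are our own assumptions. With $\theta = \sqrt2$, c = 0.1, d = 10.5 the area is 4.2 > 4, and the search duly finds lattice points other than the origin; with c = 0.05, d = 5.5 the area is 1.1 and none is found.

```python
from math import sqrt, ceil

def lattice_points_in_strip(theta, c, d):
    """Nonzero integer points (x, y) with |x - theta*y| < c and |y| < d."""
    points = []
    y_max = ceil(d) - 1                    # largest integer strictly below d
    for y in range(-y_max, y_max + 1):
        centre = theta * y
        # Only integers x near theta*y can satisfy the first inequality.
        for x in range(int(centre - c) - 1, int(centre + c) + 2):
            if (x, y) != (0, 0) and abs(x - centre) < c:
                points.append((x, y))
    return points

print(lattice_points_in_strip(sqrt(2), 0.1, 10.5))   # area 4.2, exceeds 4
print(lattice_points_in_strip(sqrt(2), 0.05, 5.5))   # area 1.1, theorem does not apply
```

The point (7, 5) found in the first case corresponds to the approximation 7/5 to $\sqrt2$, which suggests the connexion with Theorem 36.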


3.10. Proof of Minkowski’s theorem. We begin by proving a 
simple theorem whose truth is ‘intuitive’. 


THEOREM 38. Suppose that $R_O$ is an open region including O, that 
$R_P$ is the congruent and similarly situated region about any point P of $\Lambda$, 
and that no two of the regions $R_P$ overlap. Then the area of $R_O$ does not 
exceed 1. 


The theorem becomes ‘obvious’ when we consider that, if $R_O$ were 
the square bounded by the lines $x = \pm\frac12$, $y = \pm\frac12$, then the area of 
$R_O$ would be 1 and the regions $R_P$, with their boundaries, would cover 
the plane. We may give an exact proof as follows. 

Suppose that $\Delta$ is the area of $R_O$, and A the maximum distance of 
a point of $C_O$† from O; and that we consider the $(2n+1)^2$ regions $R_P$ 
corresponding to points of $\Lambda$ whose coordinates are not greater 
numerically than n. All these regions lie in the square whose sides are parallel 
to the axes and at a distance n+A from O. Hence (since the regions 
do not overlap) 

$$(2n+1)^2\Delta \le (2n+2A)^2, \qquad \Delta \le \left(1+\frac{A-\frac12}{n+\frac12}\right)^2,$$

and the result follows when we make n tend to infinity. 

It is to be noticed that there is no reference to symmetry or to con- 
vexity in Theorem 38. 


† We use C systematically for the boundary of the corresponding R. 

It is now easy to prove Minkowski’s theorem. Minkowski himself 
gave two proofs, based on the two definitions of convexity. 

(1) Take the first definition, and suppose that $R_O$ is the result of 
contracting R about O to half its linear dimensions. Then the area of 
$R_O$ is greater than 1, so that two of the regions $R_P$ of Theorem 38 
overlap, and there is a lattice-point P such that $R_O$ and $R_P$ overlap. 
Let Q (Fig. 3a) be a point common to $R_O$ and $R_P$. If OQ' is equal 
and parallel to PQ, and Q'' is the image of Q' in O, then Q', and 
therefore Q'', lies in $R_O$; and therefore, by the definition of convexity, the 
middle point of QQ'' lies in $R_O$. But this point is the middle point of 
OP; and therefore P lies in R. 

[FIG. 3 (a), (b)] 

(2) Take the second definition, and suppose that there is no lattice 
point but O in R. Expand R* about O until, as R'*, it first includes 
a lattice point P. Then P is a point of C', and there is a line l, say l', 
through P (Fig. 3b). If $R_O$ is R' contracted about O to half its linear 
dimensions, and $l_O$ is the parallel to l' through the middle point of OP, 
then $l_O$ is a line l for $R_O$. It is plainly also a line l for $R_P$, and leaves 
$R_O$ and $R_P$ on opposite sides, so that $R_O$ and $R_P$ do not overlap. 
A fortiori $R_O$ does not overlap any other $R_P$, and, since the area of 
$R_O$ is greater than 1, this contradicts Theorem 38. 

There are a number of interesting alternative proofs, of which 
perhaps the simplest is one due to Mordell. 

If R is convex and symmetrical about O, and $P_1$ and $P_2$ are points 
of R with coordinates $(x_1, y_1)$ and $(x_2, y_2)$, then $(-x_2, -y_2)$, and 
therefore the point M whose coordinates are $\frac12(x_1-x_2)$ and $\frac12(y_1-y_2)$, is also 
a point of R. 

The lines x = 2p/t, y = 2q/t, where t is a fixed positive integer and 
p and q arbitrary integers, divide up the plane into squares, of area 
$4/t^2$, whose corners are (2p/t, 2q/t). If N(t) is the number of corners in 
R, and A the area of R, then plainly $4t^{-2}N(t) \to A$ when $t \to \infty$; and 
if A > 4 then $N(t) > t^2$ for large t. But the pairs (p, q) give at most 
$t^2$ different pairs of remainders when p and q are divided by t; and 
therefore there are two points $P_1$ and $P_2$ of R, with coordinates $2p_1/t$, 
$2q_1/t$ and $2p_2/t$, $2q_2/t$, such that $p_1-p_2$ and $q_1-q_2$ are both divisible by t. 
Hence the point M, which belongs to R, is a point of $\Lambda$. 
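
The counting step in Mordell's proof lends itself to direct computation. The following sketch is ours and purely illustrative: the function name `mordell_lattice_point`, the bounding parameter `bound`, and the test region $2x^2 - 6xy + 5y^2 < 1.3$ (an ellipse of area $1.3\pi > 4$, contained in $|x|, |y| \le 3$) are all assumptions made for the example. It lists the corners (2p/t, 2q/t) lying in R, groups them by the remainders of p and q on division by t, and returns the point M obtained from the first pair that agree.

```python
from fractions import Fraction

def mordell_lattice_point(inside, t, bound):
    """Find a nonzero lattice point by the remainder argument of Mordell's proof.

    inside(x, y) tests membership of a convex region symmetrical about O;
    bound is an integer B such that the region lies in |x|, |y| <= B.
    """
    seen = {}
    limit = bound * t // 2 + 1
    for p in range(-limit, limit + 1):
        for q in range(-limit, limit + 1):
            if inside(Fraction(2 * p, t), Fraction(2 * q, t)):
                key = (p % t, q % t)
                if key in seen:
                    p0, q0 = seen[key]
                    # M = half the difference of the two corners, an integer point.
                    return ((p - p0) // t, (q - q0) // t)
                seen[key] = (p, q)
    return None

def inside(x, y):
    return 2 * x * x - 6 * x * y + 5 * y * y < Fraction(13, 10)

m = mordell_lattice_point(inside, 10, 3)
print(m, inside(*m))
```

The search stops at the first coincidence of remainders, so it does not need N(t) to exceed $t^2$; that inequality serves in the proof only to guarantee that a coincidence must occur.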


3.11. Developments of Theorem 37. There are some further 
developments of Theorem 37 which will be wanted in Ch. XXIV and 
which it is natural to prove here. We begin with a general remark 
which applies to all the theorems of §§ 3.6 and 3.9-10. 

We have been interested primarily in the ‘fundamental’ lattice L 
(or $\Lambda$), but we can see in various ways how its properties may be 
restated as general properties of lattices. We use L or $\Lambda$ now for any 
lattice of lines or points. If it is based upon the points O, P, Q, as in 
§ 3.5, then we call the parallelogram OPRQ the fundamental 
parallelogram of L or $\Lambda$. 

(i) We may set up a system of oblique Cartesian coordinates with 
OP, OQ as axes, and agree that P and Q are the points (1,0) and (0, 1). 
The area of the fundamental parallelogram is then 


$$\delta = OP \cdot OQ \cdot \sin\omega,$$


where $\omega$ is the angle between OP and OQ. The arguments of § 3.6, 
interpreted in this system of coordinates, then prove 


THEOREM 39. A necessary and sufficient condition that the 
transformation (3.6.1) shall transform $\Lambda$ into itself is that $\Delta = \pm 1$. 


THEOREM 40. If P and Q are any two points of $\Lambda$, then a necessary 
and sufficient condition that the lattice L' based upon OP and OQ should 
be equivalent to L is that the area of the parallelogram defined by OP, OQ 
should be equal to that of the fundamental parallelogram of $\Lambda$. 


(ii) The transformation 

$$x' = \alpha x+\beta y, \qquad y' = \gamma x+\delta y$$

(where now $\alpha$, $\beta$, $\gamma$, $\delta$† are any real numbers) transforms the fundamental 
lattice of § 3.5 into the lattice based upon the origin and the points 
$(\alpha, \gamma)$, $(\beta, \delta)$. It transforms lines into lines and triangles into triangles. 
If the triangle $P_1P_2P_3$, where $P_i$ is the point $(x_i, y_i)$, is transformed into 
$Q_1Q_2Q_3$, then the areas of the triangles are 

$$\pm\tfrac12\begin{vmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1\end{vmatrix}$$

and 

$$\pm\tfrac12\begin{vmatrix} \alpha x_1+\beta y_1 & \gamma x_1+\delta y_1 & 1\\ \alpha x_2+\beta y_2 & \gamma x_2+\delta y_2 & 1\\ \alpha x_3+\beta y_3 & \gamma x_3+\delta y_3 & 1\end{vmatrix} = \pm\tfrac12(\alpha\delta-\beta\gamma)\begin{vmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1\end{vmatrix}.$$

Thus areas of triangles are multiplied by the constant factor $|\alpha\delta-\beta\gamma|$; 
and the same is true of areas in general, since these are sums, or limits 
of sums, of areas of triangles. 

† The $\delta$ of this paragraph has no connexion with the $\delta$ of (i), which reappears below. 
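
The determinant identity is easily verified on a particular triangle. In the sketch below, which is an illustration only, the names `signed_area` and `transform` and the numerical values are our own assumptions; a, b, c, d stand for the four coefficients of the transformation.

```python
from fractions import Fraction

def signed_area(p1, p2, p3):
    """Half the 3x3 determinant with rows (x_i, y_i, 1) for the triangle p1 p2 p3."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return Fraction((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1), 2)

def transform(point, a, b, c, d):
    """Apply x' = a*x + b*y, y' = c*x + d*y."""
    x, y = point
    return (a * x + b * y, c * x + d * y)

triangle = [(0, 0), (3, 1), (1, 4)]
a, b, c, d = 2, 1, -1, 3                      # determinant a*d - b*c = 7
image = [transform(p, a, b, c, d) for p in triangle]
print(signed_area(*triangle), signed_area(*image))
```

The second area is 7 times the first, in agreement with the factor found above.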

We can therefore generalize any property of the fundamental lattice 
by an appropriate linear transformation. The generalization of Theorem 
38 is 

THEOREM 41. Suppose that $\Lambda$ is any lattice with origin O, and that 
$R_O$ satisfies (with respect to $\Lambda$) the conditions stated in Theorem 38. Then 
the area of $R_O$ does not exceed that of the fundamental parallelogram of $\Lambda$. 


It is convenient also to give a proof ab initio which we state at length, 
since we use similar ideas in our proof of the next theorem. The proof, 
on the lines of (i) above, is practically the same as that in § 3.10. 

The lines $$x = \pm n, \qquad y = \pm n$$ 
define a parallelogram $\Pi$ of area $4n^2\delta$, with $(2n+1)^2$ points P of $\Lambda$ 
inside it or on its boundary. We consider the $(2n+1)^2$ regions $R_P$ 
corresponding to these points. If A is the greatest value of |x| or |y| on $C_O$, 
then all these regions lie inside the parallelogram $\Pi'$, of area $4(n+A)^2\delta$, 
bounded by the lines 

$$x = \pm(n+A), \qquad y = \pm(n+A);$$

and $$(2n+1)^2\Delta \le 4(n+A)^2\delta.$$ 
Hence, making $n \to \infty$, we obtain 
$$\Delta \le \delta.$$ 


We need one more theorem which concerns the limiting case $\Delta = \delta$. 
We suppose that $R_O$ is a parallelogram; what we prove on this 
hypothesis will be sufficient for our purposes in Ch. XXIV. 

We say that two points (x, y) and (x', y') are equivalent with respect 
to L if they have similar positions in two parallelograms of L (so that 
they would coincide if one parallelogram were moved into coincidence 
with the other by parallel displacement). If L is based upon OP and 
OQ, and P and Q are $(x_1, y_1)$ and $(x_2, y_2)$, then the conditions that the 
points (x, y) and (x', y') should be equivalent are that 

$$x'-x = rx_1+sx_2, \qquad y'-y = ry_1+sy_2,$$

where r and s are integers. 

THEOREM 42. If $R_O$ is a parallelogram whose area is equal to that of 
the fundamental parallelogram of L, and there are no two equivalent points 
inside $R_O$, then there is a point, inside $R_O$ or on its boundary, equivalent 
to any given point of the plane. 


We denote the closed region corresponding to $R_P$ by $R_P^*$. 

The hypothesis that $R_O$ includes no pair of equivalent points is 
equivalent to the hypothesis that no two $R_P$ overlap. The conclusion that 
there is a point of $R_O^*$ equivalent to any point of the plane is equivalent 
to the conclusion that the $R_P^*$ cover the plane. Hence what we have to 
prove is that, if $\Delta = \delta$ and the $R_P$ do not overlap, then the $R_P^*$ cover the 
plane. 

Suppose the contrary. Then there is a point Q outside all $R_P^*$. This 
point Q lies inside or on the boundary of some parallelogram of L, and 
there is a region D, in this parallelogram, and of positive area $\eta$, outside 
all $R_P$; and a corresponding region in every parallelogram of L. Hence 
the area of all $R_P$, inside the parallelogram $\Pi'$ of area $4(n+A)^2\delta$, does 
not exceed $4(\delta-\eta)(n+A+1)^2$. 

It follows that $$(2n+1)^2\delta \le 4(\delta-\eta)(n+A+1)^2;$$ 
and therefore, making $n \to \infty$, 

$$\delta \le \delta-\eta,$$ 
a contradiction which proves the theorem. 

Finally, we may remark that all these theorems may be extended 
to space of any number of dimensions. Thus if $\Lambda$ is the fundamental 
point-lattice in three-dimensional space, i.e. the set of points (x, y, z) 
with integral coordinates, R is a convex region symmetrical about the 
origin, and of volume greater than 8, then there are points of $\Lambda$, other 
than O, in R. In n dimensions 8 must be replaced by $2^n$. We shall 
say something about this generalization, which does not require new 
ideas, in Ch. XXIV. 


NOTES ON CHAPTER III 


§ 3.1. The history of ‘Farey series’ is very curious. Theorems 28 and 29 seem 
to have been stated and proved first by Haros in 1802; see Dickson, History, 
i. 156. Farey did not publish anything on the subject until 1816, when he stated 
Theorem 29 in a note in the Philosophical Magazine. He gave no proof, and it 
is unlikely that he had found one, since he seems to have been at the best an 
indifferent mathematician. 

Cauchy, however, saw Farey’s statement, and supplied the proof (Exercices de 
mathématiques, i. 114-16). Mathematicians generally have followed Cauchy’s 
example in attributing the results to Farey, and the series will no doubt continue 
to bear his name. 

Farey has a notice of twenty lines in the Dictionary of national biography, where 
he is described as a geologist. As a geologist he is forgotten, and his biographer 
does not mention the one thing in his life which survives. 

§ 3.3. Hurwitz, Math. Annalen, 44 (1894), 417-36. 

§ 3.4. Landau, Vorlesungen, i. 98-100. 

§§ 3.5-7. Here we follow the lines of a lecture by Professor Pólya. 

§ 3.8. For Theorem 36 see Landau, Vorlesungen, i. 100. 

§ 3.9. The reader need not pay much attention to the definitions of ‘region’, 
‘boundary’, etc., given in this section if he does not wish to; he will not lose by 
thinking in terms of elementary regions such as parallelograms, polygons, or 
ellipses. Convex regions are simple regions involving no ‘topological’ difficulties. 
That a convex region has an area was first proved by Minkowski (Geometrie der 
Zahlen, Kap. 2). 

§ 3.10. Minkowski's first proof will be found in Geometrie der Zahlen, 73-76, 
and his second in Diophantische Approximationen, 28-30. Mordell's proof was 
given in Compositio Math. 1 (1934), 248-53. Another interesting proof is that by 
Hajós, Acta Univ. Hungaricae (Szeged), 6 (1934), 224-5: this was set out in full 
in the first edition of this book. 


IV 
IRRATIONAL NUMBERS 


4.1. Some generalities. The theory of ‘irrational number’, as 
explained in text books of analysis, falls outside the range of arith- 
metic. The theory of numbers is occupied, first with integers, then 
with rationals, as relations between integers, and then with irrationals, 
real or complex, of special forms, such as 


$$r+s\sqrt{2}, \qquad r+s\sqrt{(-5)},$$ 
where r and s are rational. It is not properly concerned with irrationals 
as a whole or with general criteria for irrationality (though this is a 
limitation which we shall not always respect). 

There are, however, many problems of irrationality which may be 
regarded as part of arithmetic. Theorems concerning rationals may be 
restated as theorems about integers; thus the theorem 

‘$x^3+y^3 = 3$ is insoluble in rationals’ 
may be restated in the form 
‘$a^3d^3+b^3c^3 = 3b^3d^3$ is insoluble in integers’: 
and the same is true of many theorems in which ‘irrationality’ inter- 
venes. Thus 


(P) ‘$\sqrt{2}$ is irrational’ 
means 
(Q) ‘$a^2 = 2b^2$ is insoluble in integers’, 


and then appears as a properly arithmetical theorem. We may ask 
‘is $\sqrt{2}$ irrational?’ without trespassing beyond the proper bounds of 
arithmetic, and need not ask ‘what is the meaning of $\sqrt{2}$?’ We do not 
require any interpretation of the isolated symbol $\sqrt{2}$, since the meaning 
of (P) is defined as a whole and as being the same as that of (Q).† 
In this chapter we shall be occupied with the problem 
‘is x rational or irrational?’, 


x being a number which, like $\sqrt{2}$, e, or $\pi$, makes its appearance naturally 
in analysis. 


4.2. Numbers known to be irrational. The problem which we 
are considering is generally difficult, and there are few different types 
of numbers x for which the solution has been found. In this chapter 
we shall confine our attention to a few of the simplest cases, but it 
may be convenient to begin by a rough general statement of what is 
known. The statement must be rough because any more precise 
statement requires ideas which we have not yet defined. 

† In short $\sqrt{2}$ may be treated here as an ‘incomplete symbol’ in the sense of Principia 
Mathematica. 

There are, broadly, among numbers which occur naturally in analysis, 
two types of numbers whose irrationality has been established. 

(a) Algebraic irrationals. The irrationality of $\sqrt{2}$ was proved by 
Pythagoras or his pupils, and later Greek mathematicians extended the 
conclusion to $\sqrt{3}$ and other square roots. It is now easy to prove that 

$$\sqrt[m]{N}$$ 
is generally irrational for integral m and N. Still more generally, 
numbers defined by algebraic equations with integral coefficients, unless 
‘obviously’ rational, can be shown to be irrational by the use of a 
theorem of Gauss. We prove this theorem (Theorem 45) in § 4.3. 

(b) The numbers e and $\pi$ and numbers derived from them. It is easy 
to prove e irrational (see § 4.7); and the proof, simple as it is, involves 
the ideas which are most fundamental in later extensions of the theorem. 
$\pi$ is irrational, but of this there is no really simple proof. All powers of 
e or $\pi$, and polynomials in e or $\pi$ with rational coefficients, are irrational. 
Numbers such as 

$$e^{\sqrt{2}}, \quad e^{\sqrt{5}}, \quad \sqrt{7}\,e^{\sqrt[3]{2}}, \quad \log 2$$ 
are irrational. We shall return to this subject in Ch. XI (§§ 11.13-14). 

It was not until 1929 that theorems were discovered which go beyond 
those of §§ 11.13-14 in any very important way. It has been shown 
recently that further classes of numbers, in which 


e7, 22, en 
are included, are irrational. The irrationality of such numbers as 

$$2^e, \quad \pi^e, \quad \pi^{\sqrt{2}},$$ 

or ‘Euler’s constant’ $\gamma$† is still unproved. 


4.3. The theorem of Pythagoras and its generalizations. We 
shall begin by proving 

THEOREM 43 (PYTHAGORAS’ THEOREM). $\sqrt{2}$ is irrational. 

We shall give three proofs of this theorem, two here and one in § 4.6. 
The theorem and its simplest generalizations, though trivial now, deserve 
intensive study. The old Greek theory of proportion was based on the 
hypothesis that magnitudes of the same kind were necessarily 
commensurable, and it was the discovery of Pythagoras which, by exposing 
the inadequacy of this theory, opened the way for the more profound 
theory of Eudoxus which is set out in Euclid v. 

† $\gamma = \lim_{n\to\infty}\left(1+\frac12+\dots+\frac1n-\log n\right)$. 

(a) First proof. The traditional proof ascribed to Pythagoras runs 
as follows. If $\sqrt{2}$ is rational, then the equation 


(4.3.1)    $a^2 = 2b^2$ 


is soluble in integers a, b with (a, b) = 1. Hence $a^2$ is even, and 
therefore a is even. If a = 2c, then $4c^2 = 2b^2$, $2c^2 = b^2$, and b is also even, 
contrary to the hypothesis that (a, b) = 1. 

(b) Second proof. It follows from (4.3.1) that $b \mid a^2$, and a fortiori that 
$p \mid a^2$ for any prime factor p of b. Hence $p \mid a$. Since (a, b) = 1, this is 
impossible. Hence b = 1 and 2 is the square of an integer a, which is 
false. 

The two proofs are very similar, but there is an important difference. 
In (a) we consider divisibility by 2, a given number; in (b) we consider 
divisibility by the unknown number b. For this reason (a) is, as we 
shall see in a moment, the logically simpler proof, while (b) lends itself 
more readily to generalization. 

Similar arguments prove the more general 


THEOREM 44. $\sqrt[m]{N}$ is irrational, unless N is the m-th power of an 
integer n. 
The proofs corresponding to (a) and (b) above may be stated thus. 
(a) Suppose that 
(4.3.2)    $a^m = Nb^m$, 


where (a, b) = 1. If p is any prime factor of N, then $p \mid a^m$ and 
therefore $p \mid a$. If $p^s$ is the highest power of p which divides a, so that 

$$a = p^s\alpha, \qquad p \nmid \alpha,$$ 
then $$p^{sm}\alpha^m = Nb^m.$$ 


But $p \nmid b$ and $p \nmid \alpha$, and therefore N is divisible by $p^{sm}$ and by no higher 
power of p. Since this is true of all prime factors of N, N is an m-th 
power. 

(b) It follows from (4.3.2) that $b \mid a^m$, and $p \mid a^m$ for every prime factor 
p of b. Hence $p \mid a$, and from this it follows as before that b = 1. It 
will be observed that this proof is almost the same as the second proof 
of Theorem 43, whereas (a) has become noticeably more complex. 

A still more general theorem is 

THEOREM 45. If x is a root of an equation 

$$x^m+c_1x^{m-1}+\dots+c_m = 0,$$ 

with integral coefficients of which the first is unity, then x is either integral 
or irrational. 

In the particular case in which the equation is 

$$x^m-N = 0,$$ 

Theorem 45 reduces to Theorem 44. 

We may plainly suppose that $c_m \ne 0$. We argue as under (b) above. 
If x = a/b, where (a, b) = 1, then 

$$a^m+c_1a^{m-1}b+\dots+c_mb^m = 0.$$ 

Hence $b \mid a^m$, and from this it follows as before that b = 1. 
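
Theorem 45 reduces the search for rational roots of such an equation to a finite trial of integers, since an integral root must divide $c_m$. The following Python sketch illustrates this; it is ours and not part of the text, and the names `rational_roots_monic` and `is_root_irrational` are assumptions made for the example.

```python
def rational_roots_monic(coeffs):
    """Rational roots of x^m + c_1 x^(m-1) + ... + c_m, with coeffs = [c_1, ..., c_m].

    The coefficients are integers and c_m != 0. By Theorem 45 every rational
    root is an integer, and an integral root must divide c_m.
    """
    c_m = coeffs[-1]

    def value(x):
        result = 1
        for c in coeffs:
            result = result * x + c        # Horner's rule
        return result

    divisors = [d for d in range(1, abs(c_m) + 1) if c_m % d == 0]
    return sorted({s * d for d in divisors for s in (1, -1) if value(s * d) == 0})

def is_root_irrational(m, N):
    """True when the positive m-th root of N is irrational (Theorem 44)."""
    coeffs = [0] * (m - 1) + [-N]          # the equation x^m - N = 0
    return not any(r > 0 for r in rational_roots_monic(coeffs))

print(rational_roots_monic([0, -2]))       # x^2 - 2: no rational root
print(rational_roots_monic([0, -4]))       # x^2 - 4: roots -2 and 2
print(is_root_irrational(3, 2), is_root_irrational(4, 16))
```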


4.4. The use of the fundamental theorem in the proofs of 
Theorems 43-45. It is important, in view of the historical discussion 
in the next section, to observe what use is made, in the proofs of 
§ 4.3, of the fundamental theorem of arithmetic or of the ‘equivalent’ 
Theorem 3. 

The critical inference, in either proof of Theorem 44, is 

‘$p \mid a^m \to p \mid a$’. 
Here we use Theorem 3. The same remark applies to the second proof 
of Theorem 43, the only simplification being that m = 2. In all these 
proofs Theorem 3 plays an essential part. 

The situation is different in the first proof of Theorem 43, since here 
we are considering divisibility by the special number 2. We need 
‘$2 \mid a^2 \to 2 \mid a$’, and this can be proved by ‘enumeration of cases’ and 
without an appeal to Theorem 3. Since 


$$(2m+1)^2 = 4m^2+4m+1,$$ 

the square of an odd number is odd, and the conclusion follows. 

Similarly, we can dispense with Theorem 3 in the proof of Theorem 
44 for any special m and N. Suppose, for example, that m = 2, N = 5. 
We need ‘$5 \mid a^2 \to 5 \mid a$’. Now any number a which is not a multiple 
of 5 is of one of the forms 

$$5m+1, \quad 5m+2, \quad 5m+3, \quad 5m+4,$$ 

and the squares of these numbers leave remainders 


$$1, \quad 4, \quad 4, \quad 1$$ 
after division by 5. 

If m = 2, N = 6, we argue with 2, the smallest prime factor of 6, 
and the proof is almost identical with the first proof of Theorem 43. 
With m = 2 and 


N = 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 17, 18 
we argue with the divisors 
2, 3, 5, 2, 7, 4, 2, 11, 3, 13, 2, 3, 17, 2, 


the smallest prime factors of N which occur in odd multiplicity or, in 
the case of 8, an appropriate power of this prime factor. It is instructive 
to work through some of these cases; it is only when N is prime that the 
proof runs exactly according to the original pattern, and then it becomes 
tedious for the larger values of N. 
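
The ‘enumeration of cases’ is mechanical, and a few lines of Python carry it out. The sketch below is an illustration only; the function names are our own assumptions. It lists the remainders of the squares of d·m + r for r = 1, ..., d-1, and confirms for each prime divisor in the list above that no such square is divisible by d.

```python
def square_residues(d):
    """Remainders of (d*m + r)^2 on division by d, for r = 1, ..., d - 1."""
    return [(r * r) % d for r in range(1, d)]

def divisibility_lemma_holds(d):
    """Check by enumeration that d | a^2 implies d | a."""
    return 0 not in square_residues(d)

print(square_residues(5))
print(all(divisibility_lemma_holds(p) for p in (2, 3, 5, 7, 11, 13, 17)))
print(divisibility_lemma_holds(4))
```

The last line gives False, since 4 divides $2^2$ but not 2; this is one way of seeing why the case N = 8 does not run exactly according to the original pattern.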

We can deal similarly with cases such as m = 3, N = 2, 3, or 5; but 
we confine ourselves to those which are relevant in §§ 4.5-6. 


4.5. A historical digression. There is a curious historical puzzle 
on which the preceding discussion throws a good deal of light. 

It is unknown when, or by whom, the ‘theorem of Pythagoras’ was 
discovered. ‘The discovery’, says Heath, ‘can hardly have been made 
by Pythagoras himself, but it was certainly made in his school.’† 
Pythagoras lived about 570-490 B.C. Democritus, born about 470, wrote ‘on 
irrational lines and solids’, and ‘it is difficult to resist the conclusion 
that the irrationality of $\sqrt{2}$ was discovered before Democritus’ time’. 

It would seem that no extension of the theorem was made for over 
fifty years. There is a famous passage in Plato’s Theaetetus in which it 
is stated that Theodorus (Plato’s teacher) proved the irrationality of 


$$\sqrt{3}, \quad \sqrt{5}, \quad \dots,$$ 


‘taking all the separate cases up to the root of 17 square feet, at which 
point, for some reason, he stopped’. We have no accurate information 
about this or other discoveries of Theodorus, but Plato lived 429-348, 
and it seems reasonable to date this discovery about 410-400. 

The question how Theodorus proved his theorems has exercised the 
ingenuity of every historian. It would be natural to conjecture that he 
used some modification of the ‘traditional’ method of Pythagoras, such 
as those which we discussed in the last section. In that case, since he 
cannot have known the fundamental theorem,‡ and it is unlikely that 
he knew even Euclid’s Theorem 3, he must have argued much as we 
argued at the end of § 4.4. 

† Sir Thomas Heath, A manual of Greek mathematics, 54-55. In what follows passages 
in inverted commas, unless attributed to other writers, are quotations from this book 
or from the same writer’s A history of Greek mathematics. 
‡ See Ch. XII, § 12.5, for some further discussion of this point. 

Some historians, however, such as Zeuthen and Heath, have objected 
to this conjecture on other grounds. Thus Heath remarks that 
‘the objection to this conjecture as to the nature of Theodorus’ proof 
is that it is so easy an adaptation of the traditional proof regarding $\sqrt{2}$ 
that it would hardly be important enough to mention as a new discovery’ 
and that 


‘it would be clear, long before $\sqrt{17}$ was reached, that it is generally 
applicable ...’; 


and regards these objections as ‘difficult to meet’. 
Zeuthen assumes 


‘(a) that the method of proof used by Theodorus must have been 
sufficiently original to call for special notice from Plato, and (b) that 
it must have been of such a kind that the application of it to each 
surd required to be set out separately in consequence of the variations 
in the numbers entering into the proofs’; 
and considers that 


‘neither of these conditions is satisfied by the hypothesis of mere 
adaptation to $\sqrt{3}$, $\sqrt{5}$, ... of the traditional proof with regard to $\sqrt{2}$’. 


On these grounds he puts forward an entirely different hypothesis about 
the nature of Theodorus’ proof. 

The method of proof suggested by Zeuthen is most interesting, and 
his hypothesis may be correct. But it should be clear by now that (what- 
ever the historical truth may be) the reasons advanced by Zeuthen and 
Heath are quite unconvincing. To prove Theodorus’ theorems, as we 
proved them in § 4.4, and without assuming any general theorem such 
as Theorem 3, requires a good deal more than a ‘trivial’ variation of 
the Pythagorean proof. If Theodorus proved them thus, then his work 
fully satisfied Zeuthen’s criteria; it was certainly original enough to 
‘call for special notice from Plato’, and it did require ‘to be set out 
separately’ in every case. By the time Theodorus had finished with $\sqrt{17}$, 
he may well have been quite tired; it would be what he had done and 
not what he had not done that should fill us with surprise. 


† We give two examples of it in § 4.6. 


4.6. Geometrical proofs of the irrationality of $\sqrt{2}$ and $\sqrt{5}$. The 
proofs suggested by Zeuthen vary from number to number, and the 
variations depend at bottom on the form of the periodic continued 
fraction† which represents $\sqrt{N}$. We take as typical the simplest case 
(N = 5) and the lowest case (N = 2). 


(a) N = 5. We argue in terms of 


$$x = \tfrac12(\sqrt{5}-1).$$ 
Then $$x^2 = 1-x.$$ 
Geometrically, if AB = 1, AC = x, then 
$$AC^2 = AB \cdot CB$$ 
and AB is divided ‘in golden section’ by C. These relations are 
fundamental in the construction of the regular pentagon inscribed in a circle 
(Euclid iv. 11). 

[FIG. 4] 

If we divide 1 by x, taking the largest possible integral quotient, viz. 
1,‡ the remainder is $1-x = x^2$. If we divide x by $x^2$, the quotient is 
again 1 and the remainder is $x-x^2 = x^3$. We next divide $x^2$ by $x^3$, and 
continue the process indefinitely; at each stage the ratios of the number 
divided, the divisor, and the remainder are the same. Geometrically, 
if we take $CC_1$ equal and opposite to CB, CA is divided at $C_1$ in the 
same ratio as AB at C, i.e. in golden section; if we take $C_1C_2$ equal and 
opposite to $C_1A$, then $C_1C$ is divided in golden section at $C_2$; and so on.|| 
Since we are dealing at each stage with a segment divided in the same 
ratio, the process can never end. 

It is easy to see that this contradicts the hypothesis of the rationality 
of x. If x is rational, then AB and AC are integral multiples of the same 
length $\delta$, and the same is true of 

$$C_1C = CB = AB-AC, \qquad C_1C_2 = AC_1 = AC-C_1C, \quad \dots,$$ 
i.e. of all the segments in the figure. Hence we can construct an infinite 
sequence of descending integral multiples of $\delta$, and this is plainly 
impossible. 
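
The successive divisions can be carried out exactly by working with numbers of the form a + bx, where a and b are integers and $x^2 = 1-x$. The sketch below is our own illustration; the representation by integer pairs and the function names are assumptions. It confirms that, with quotient 1 at every stage, the remainder is always x times the divisor, so that the same ratios recur and the process never ends.

```python
def times_x(p):
    """Multiply a + b*x by x, using x^2 = 1 - x; p is the integer pair (a, b)."""
    a, b = p
    return (b, a - b)

def subtract(p, q):
    return (p[0] - q[0], p[1] - q[1])

def approx(p):
    """Numerical value of a + b*x with x = (sqrt(5) - 1)/2."""
    x = (5 ** 0.5 - 1) / 2
    return p[0] + p[1] * x

def division_steps(steps):
    """(dividend, divisor, remainder) for 1 / x, x / x^2, ..., with quotient 1."""
    dividend, divisor = (1, 0), (0, 1)     # the numbers 1 and x
    out = []
    for _ in range(steps):
        remainder = subtract(dividend, divisor)
        out.append((dividend, divisor, remainder))
        dividend, divisor = divisor, remainder
    return out

for dividend, divisor, remainder in division_steps(8):
    print(remainder == times_x(divisor), 0 < approx(remainder) < approx(divisor))
```

Each line of output should read True True: the remainder is x times the divisor, and it is positive and less than the divisor, so that the quotient 1 is indeed the largest possible.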

(b) N = 2. This case is best treated by a two-dimensional argument. 

Let AB, AC be two sides of a unit square ABDC; take $BD_1 = AB$ 
along the diagonal BC; and let the perpendicular to BC at $D_1$ meet 
AC in $B_1$. The elementary properties of triangles show that 


$$AB_1 = B_1D_1 = D_1C.$$ 


† See Ch. X, § 10.12. 
‡ Since $\frac12 < x < 1$. 
|| $C_2C_3$ equal and opposite to $C_2C$, $C_3C_4$ equal and opposite to $C_3C_1$, .... The new 
segments defined are measured alternately to the left and the right. 

We now complete the square $A_1B_1D_1C$ and repeat the construction, 
taking $$B_1D_2 = A_1B_1, \qquad B_2D_3 = A_2B_2,$$ 
as indicated in the figure. Each square constructed is dissected in the 
same proportions, and the process cannot end. 


[FIG. 5] 


If $\sqrt{2}$ were rational, i.e. if AC and BC were integral multiples of the 
same length $\delta$, the same would be true of 
$$A_1B_1 = D_1C = BC-BD_1 = BC-AC$$ 
and of $$B_1C = AC-AB_1 = AC-B_1D_1 = AC-A_1B_1,$$ 
and so, by repetition of the argument, of all the segments in the figure; 
and plainly we should arrive at the same contradiction as before. 


4.7. Some more irrational numbers. We know, after Theorem 
44, that $$\sqrt{7}, \quad \sqrt[3]{2}, \quad \sqrt[4]{11}, \quad \dots$$ 
are irrational. After Theorem 45, 

$$x = \sqrt{2}+\sqrt{3}$$ 
is irrational, since it is not an integer and satisfies 
$$x^4-10x^2+1 = 0.$$ 
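
That $\sqrt2+\sqrt3$ satisfies the quartic may be verified by exact arithmetic on numbers of the form $a+b\sqrt2+c\sqrt3+d\sqrt6$ with integral a, b, c, d. The sketch below is an illustration only; the representation by 4-tuples and the name `multiply` are our own assumptions.

```python
def multiply(u, v):
    """Product of two numbers a + b*sqrt2 + c*sqrt3 + d*sqrt6, stored as integer 4-tuples."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,   # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),         # sqrt3 * sqrt6 = 3 sqrt2
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),         # sqrt2 * sqrt6 = 2 sqrt3
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,               # sqrt2 * sqrt3 = sqrt6
    )

x = (0, 1, 1, 0)                 # sqrt2 + sqrt3
x2 = multiply(x, x)              # 5 + 2 sqrt6
x4 = multiply(x2, x2)            # 49 + 20 sqrt6
residual = tuple(p - 10 * q + r for p, q, r in zip(x4, x2, (1, 0, 0, 0)))
print(x2, x4, residual)
```

The residual $x^4-10x^2+1$ comes out as (0, 0, 0, 0), and since $3 < \sqrt2+\sqrt3 < 4$ the root is not an integer.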

We can construct irrationals freely by means of decimals or continued 
fractions, as we shall see in Chs. IX and X; but it is not easy, without 


theorems such as we shall prove in §§ 11.13-14, to add to our list many 
of the numbers which occur naturally in analysis. 

THEOREM 46. $\log_{10} 2$ is irrational. 

This is trivial, since $$\log_{10} 2 = \frac{a}{b}$$ 
involves $2^b = 10^a$, which is impossible. More generally $\log_n m$ is 
irrational if m and n are integers, one of which has a prime factor which 
the other lacks. 

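THEOREM 47. e is irrational. 


Let us suppose e rational, so that e = a/b where a and b are integers. 
If $k \ge b$ and 

$$\alpha = k!\left(e-1-\frac{1}{1!}-\frac{1}{2!}-\dots-\frac{1}{k!}\right),$$ 
then $b \mid k!$ and $\alpha$ is an integer. But 

$$0 < \alpha = \frac{1}{k+1}+\frac{1}{(k+1)(k+2)}+\dots < \frac{1}{k+1}+\frac{1}{(k+1)^2}+\dots = \frac{1}{k},$$ 

and this is a contradiction. 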
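
The inequality for $\alpha$ can be observed numerically. In the sketch below, which is ours and illustrative only, the name `alpha_tail` and the truncation after 60 terms are assumptions; the series for $\alpha$ is summed in exact rational arithmetic, so that the value printed is a partial sum and lies slightly below $\alpha$ itself.

```python
from fractions import Fraction
from math import factorial

def alpha_tail(k, terms=60):
    """Partial sum of alpha = sum over j >= 1 of k!/(k+j)!, taken to `terms` terms."""
    return sum(Fraction(factorial(k), factorial(k + j)) for j in range(1, terms + 1))

for k in range(1, 8):
    a = alpha_tail(k)
    print(k, float(a), 0 < a < Fraction(1, k))
```

For k = 1 the sum is e - 2, approximately 0.718; in every case the value lies strictly between 0 and 1/k, so that $\alpha$ cannot be an integer.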

In this proof, we assumed the theorem false and deduced that $\alpha$ was 
(i) integral, (ii) positive, and (iii) less than one, an obvious contradiction. 
We prove two further theorems by more sophisticated applications of 
the same idea. 

For any positive integer n, we write 


$$f = f(x) = \frac{x^n(1-x)^n}{n!} = \frac{1}{n!}\sum_{m=n}^{2n} c_m x^m,$$ 


where the $c_m$ are integers. For 0 < x < 1, we have 

(4.7.1)    $0 < f(x) < \frac{1}{n!}.$ 


Again f(0) = 0 and $f^{(m)}(0) = 0$ if m < n or m > 2n. But, if $n \le m \le 2n$, 


$$f^{(m)}(0) = \frac{m!}{n!}c_m,$$ 


an integer. Hence f(x) and all its derivatives take integral values at 
x = 0. Since f(1-x) = f(x), the same is true at x = 1. 
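
The integrality of the derivatives may be checked directly from the binomial expansion $x^n(1-x)^n = \sum_j (-1)^j \binom{n}{j} x^{n+j}$. The following sketch is an illustration under our own naming assumptions (`coefficients`, `derivative_at_zero`, `derivative_at_one`).

```python
from fractions import Fraction
from math import comb, factorial

def coefficients(n):
    """The integers c_m (n <= m <= 2n) with x^n (1-x)^n = sum of c_m x^m, as a dict."""
    return {n + j: (-1) ** j * comb(n, j) for j in range(n + 1)}

def derivative_at_zero(n, m):
    """The m-th derivative of f(x) = x^n (1-x)^n / n! at x = 0."""
    c = coefficients(n).get(m, 0)          # zero when m < n or m > 2n
    return Fraction(factorial(m) * c, factorial(n))

def derivative_at_one(n, m):
    """The m-th derivative at x = 1; since f(1-x) = f(x) it is (-1)^m times that at 0."""
    return (-1) ** m * derivative_at_zero(n, m)

n = 2
print([derivative_at_zero(n, m) for m in range(0, 2 * n + 2)])
print([derivative_at_one(n, m) for m in range(0, 2 * n + 2)])
```

For n = 2, where $f(x) = \frac12(x^2-2x^3+x^4)$, the derivatives at 0 are 0, 0, 1, -6, 12, 0, all integers as the text asserts.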
THEOREM 48. $e^y$ is irrational for every rational $y \ne 0$. 
If y = h/k and $e^y$ is rational, so is $e^{ky} = e^h$. Again, if $e^{-h}$ is rational, 
so is $e^h$. Hence it is enough to prove that, if h is a positive integer, $e^h$ 
cannot be rational. Suppose this false, so that $e^h = a/b$ where a, b are 
positive integers. We write 


$$F(x) = h^{2n}f(x)-h^{2n-1}f'(x)+\dots-hf^{(2n-1)}(x)+f^{(2n)}(x),$$ 
so that F(0) and F(1) are integers. We have 


$$\frac{d}{dx}\{e^{hx}F(x)\} = e^{hx}\{hF(x)+F'(x)\} = h^{2n+1}e^{hx}f(x).$$ 

Hence $$b\int_0^1 h^{2n+1}e^{hx}f(x)\,dx = b\left[e^{hx}F(x)\right]_0^1 = aF(1)-bF(0),$$ 

an integer. But, by (4.7.1), 


$$0 < b\int_0^1 h^{2n+1}e^{hx}f(x)\,dx < \frac{bh^{2n}e^h}{n!} < 1$$ 


for large enough n, a contradiction. 
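THEOREM 49. $\pi$ and $\pi^2$ are irrational. 

Suppose $\pi^2$ rational, so that $\pi^2 = a/b$, where a, b are positive integers. 
We write 

    $G(x) = b^n\{\pi^{2n} f(x) - \pi^{2n-2} f''(x) + \pi^{2n-4} f^{(4)}(x) - \dots + (-1)^n f^{(2n)}(x)\}$, 

so that G(0) and G(1) are integers. We have 

    $\frac{d}{dx}\{G'(x) \sin \pi x - \pi G(x) \cos \pi x\} = \{G''(x) + \pi^2 G(x)\} \sin \pi x$ 
        $= b^n \pi^{2n+2} f(x) \sin \pi x = \pi^2 a^n \sin \pi x \, f(x)$. 

Hence 

    $\pi \int_0^1 a^n \sin \pi x \, f(x)\,dx = \left[\frac{G'(x) \sin \pi x}{\pi} - G(x) \cos \pi x\right]_0^1 = G(0) + G(1)$, 

an integer. But, by (4.7.1), 

    $0 < \pi \int_0^1 a^n \sin \pi x \, f(x)\,dx < \frac{\pi a^n}{n!} < 1$ 

for large enough n, a contradiction. 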


NOTES ON CHAPTER IV 


§ 4.2. The irrationality of e and $\pi$ was proved by Lambert in 1761; and that 
of $e^\pi$ by Gelfond in 1929. See the notes on Ch. XI. 

§§ 4.3-6. A reader interested in Greek mathematics will find what 
bibliographical information he requires in Heath’s books referred to on p. 42. 

We do not give specific references, except when we quote Heath, nor attempt 
to assign Greek theorems to their real discoverers. Thus we use ‘Pythagoras’ 
for ‘some mathematician of the Pythagorean school’. 

§ 4.3. Theorem 45 is proved, in a more general form, by Gauss, D.A., § 42. 

§ 4.6. Our construction in the case N = 2 follows Rademacher and Toeplitz, 
15-17. 

§ 4.7. Our proof of Theorem 48 is based on that of Hermite (Œuvres, 3, 154) 
and our proof of Theorem 49 on that of Niven (Bulletin Amer. Math. Soc. 53 
(1947), 509). 


V 
CONGRUENCES AND RESIDUES 


5.1. Highest common divisor and least common multiple. We 
have already defined the highest common divisor (a, b) of two numbers 
a and b. There is a simple formula for this number. 

We denote by min(x, y) and max(x, y) the lesser and the greater of 
x and y. Thus min(1, 2) = 1, max(1, 1) = 1. 


THEOREM 50. If 

    $a = \prod_p p^{\alpha}$  ($\alpha \ge 0$),† 

and 

    $b = \prod_p p^{\beta}$  ($\beta \ge 0$), 

then 

    $(a, b) = \prod_p p^{\min(\alpha, \beta)}$. 

This theorem is an immediate consequence of Theorem 2 and the 
definition of (a, b). 

The least common multiple of two numbers a and b is the least positive 
number which is divisible by both a and b. We denote it by {a, b}, so that 

    $a \mid \{a, b\}$,  $b \mid \{a, b\}$, 

and {a, b} is the least number which has this property. 

THEOREM 51. In the notation of Theorem 50, 

    $\{a, b\} = \prod_p p^{\max(\alpha, \beta)}$. 

From Theorems 50 and 51 we deduce 

THEOREM 52. 

    $\{a, b\} = \frac{ab}{(a, b)}$. 
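
Theorems 50 to 52 translate directly into a computation on prime exponents. The sketch below is illustrative only: it factorizes by trial division (adequate for small numbers), and the names `factorize`, `hcd`, and `lcm` are assumptions introduced here.

```python
from collections import Counter
from math import gcd


def factorize(n: int) -> Counter:
    """Prime factorization of n >= 1 by trial division, as {prime: exponent}."""
    factors, p = Counter(), 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1
    return factors


def hcd(a: int, b: int) -> int:
    """(a, b) as the product of p^min(alpha, beta), as in Theorem 50."""
    fa, fb = factorize(a), factorize(b)
    result = 1
    for p in fa.keys() | fb.keys():
        result *= p ** min(fa[p], fb[p])  # a missing prime has exponent 0
    return result


def lcm(a: int, b: int) -> int:
    """{a, b} as the product of p^max(alpha, beta), as in Theorem 51."""
    fa, fb = factorize(a), factorize(b)
    result = 1
    for p in fa.keys() | fb.keys():
        result *= p ** max(fa[p], fb[p])
    return result


print(hcd(12, 18), lcm(12, 18), gcd(12, 18))
print(hcd(12, 18) * lcm(12, 18) == 12 * 18)  # Theorem 52
```

Since min(α, β) + max(α, β) = α + β for each prime, the product of the two results is ab, which is the content of Theorem 52.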
If (a, b) = 1, a and b are said to be prime to one another or coprime. 
The numbers a, b, c,..., k are said to be coprime if every two of them 
are coprime. To say this is to say much more than to say that 

    (a, b, c,..., k) = 1, 

which means merely that there is no number but 1 which divides all 
of a, b, c,..., k. 

We shall sometimes say that ‘a and b have no common factor’ when 
we mean that they have no common factor greater than 1, i.e. that 
they are coprime. 

† The symbol $\prod_p f(p)$ denotes a product extended over all prime values 
of p. The symbol $\prod_{p \mid m} f(p)$ denotes a product extended over all primes 
which divide m. In the first formula of Theorem 50, $\alpha$ is zero unless $p \mid a$ 
(so that the product is really a finite product). We might equally well write 
$\prod_{p \mid a} p^{\alpha}$; in this case every $\alpha$ would be positive. 


5.2. Congruences and classes of residues. If m is a divisor of 
x - a, we say that x is congruent to a to modulus m, and write 

    $x \equiv a \pmod{m}$. 

The definition does not introduce any new idea, since ‘$x \equiv a \pmod{m}$’ 
and ‘$m \mid x - a$’ have the same meaning, but each notation has its 
advantages. We have already used the word ‘modulus’ in a different sense 
in § 2.9, but the ambiguity will not cause any confusion.† 

By $x \not\equiv a \pmod{m}$ we mean that x is not congruent to a. 

If $x \equiv a \pmod{m}$, then a is called a residue of x to modulus m. If 
$0 \le a \le m - 1$, then a is the least residue‡ of x to modulus m. Thus two 
numbers a and b congruent (mod m) have the same residues (mod m). 
A class of residues (mod m) is the class of all the numbers congruent to 
a given residue (mod m), and every member of the class is called a 
representative of the class. It is clear that there are in all m classes, 
represented by 

    0, 1, 2,..., m - 1. 

These m numbers, or any other set of m numbers of which one belongs 
to each of the m classes, form a complete system of incongruent residues 
to modulus m, or, more shortly, a complete system (mod m). 

Congruences are of great practical importance in everyday life. For 
example, ‘today is Saturday’ is a congruence property (mod 7) of the 
number of days which have passed since some fixed date. This property 
is usually much more important than the actual number of days which 
have passed since, say, the creation. Lecture lists or railway guides 
are tables of congruences; in the lecture list the relevant moduli are 
365, 7, and 24. 

To find the day of the week on which a particular event falls is to 
solve a problem in ‘arithmetic (mod 7)’. In such an arithmetic 
congruent numbers are equivalent, so that the arithmetic is a strictly 
finite science, and all problems in it can be solved by trial. Suppose, 
for example, that a lecture is given on every alternate day (including 
Sundays), and that the first lecture occurs on a Monday. When will a 
lecture first fall on a Tuesday? If this lecture is the (x+1)th then 

    $2x \equiv 1 \pmod{7}$; 

and we find by trial that the least positive solution is 

    x = 4. 

Thus the fifth lecture will fall on a Tuesday and this will be the first 
that will do so. 

† The dual use has a purpose because the notion of a ‘congruence with respect to 
a modulus of numbers’ occurs at a later stage in the theory, though we shall not use it 
in this book. 
‡ Strictly, least non-negative residue. 


Similarly, we find by trial that the congruence 

    $x^2 \equiv 1 \pmod{8}$ 

has just four solutions, namely 

    $x \equiv 1, 3, 5, 7 \pmod{8}$. 
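
Because arithmetic to a modulus is finite, 'solving by trial' is literally an exhaustive search over one complete system of residues. A minimal sketch (the helper `solve_by_trial` is a name introduced here) reproduces both examples of this section:

```python
def solve_by_trial(predicate, modulus: int) -> list[int]:
    """Residues x with 0 <= x < modulus for which predicate(x) holds."""
    return [x for x in range(modulus) if predicate(x)]


# The lecture problem: 2x congruent to 1 (mod 7).
lecture = solve_by_trial(lambda x: (2 * x) % 7 == 1, 7)
# The quadratic congruence: x^2 congruent to 1 (mod 8).
squares = solve_by_trial(lambda x: (x * x) % 8 == 1, 8)

print(lecture)  # [4]
print(squares)  # [1, 3, 5, 7]
```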
It is sometimes convenient to use the notation of congruences even 
when the variables which occur in them are not integers. Thus we may 
write 

    $x \equiv y \pmod{z}$ 

whenever x - y is an integral multiple of z, so that, for example, 

    $\frac{3}{2} \equiv \frac{1}{2} \pmod{1}$,  $-\pi \equiv \pi \pmod{2\pi}$. 


5.3. Elementary properties of congruences. It is obvious that 
congruences to a given modulus m have the following properties: 

    (i)   $a \equiv b \rightarrow b \equiv a$, 
    (ii)  $a \equiv b \;.\; b \equiv c \rightarrow a \equiv c$, 
    (iii) $a \equiv a' \;.\; b \equiv b' \rightarrow a + b \equiv a' + b'$. 

Also, if $a \equiv a'$, $b \equiv b'$,... we have 

    (iv)  $ka + lb + \dots \equiv ka' + lb' + \dots$, 
    (v)   $a^2 \equiv a'^2$,  $a^3 \equiv a'^3$, 

and so on; and finally, if $\phi(a, b, \dots)$ is any polynomial with integral 
coefficients, we have 

    (vi)  $\phi(a, b, \dots) \equiv \phi(a', b', \dots)$. 

THEOREM 53. If $a \equiv b \pmod{m}$ and $a \equiv b \pmod{n}$, then 

    $a \equiv b \pmod{\{m, n\}}$. 

In particular, if (m, n) = 1, then 

    $a \equiv b \pmod{mn}$. 

This follows from Theorem 50. If $p^c$ is the highest power of p which 
divides {m, n}, then $p^c \mid m$ or $p^c \mid n$ and so $p^c \mid (a - b)$. This is true for every 
prime factor of {m, n}, and so 

    $a \equiv b \pmod{\{m, n\}}$. 

The theorem generalizes in the obvious manner to any number of 
congruences. 

5.4. Linear congruences. The properties (i)-(vi) are like those of 
equations in ordinary algebra, but we soon meet with a difference. It 
is not true that 

    $ka \equiv ka' \rightarrow a \equiv a'$; 

for example 

    $2 \cdot 2 \equiv 2 \cdot 4 \pmod{4}$, 

but 

    $2 \not\equiv 4 \pmod{4}$. 

We consider next what is true in this direction. 

THEOREM 54. If (k, m) = d, then 

    $ka \equiv ka' \pmod{m} \rightarrow a \equiv a' \pmod{m/d}$, 

and conversely. 

Since (k, m) = d, we have 

    $k = k_1 d$,  $m = m_1 d$,  $(k_1, m_1) = 1$. 

Then 

    $\frac{ka - ka'}{m} = \frac{k_1 (a - a')}{m_1}$, 

and, since $(k_1, m_1) = 1$, 

    $m \mid ka - ka' \;\equiv\; m_1 \mid a - a'$.† 

This proves the theorem. A particular case is 

THEOREM 55. If (k, m) = 1, then 

    $ka \equiv ka' \pmod{m} \rightarrow a \equiv a' \pmod{m}$ 

and conversely. 


THEOREM 56. If $a_1, a_2, \dots, a_m$ is a complete system of incongruent 
residues (mod m) and (k, m) = 1, then $ka_1, ka_2, \dots, ka_m$ is also such a 
system. 

For $ka_i - ka_j \equiv 0 \pmod{m}$ implies $a_i - a_j \equiv 0 \pmod{m}$, by Theorem 
55, and this is impossible unless i = j. More generally, if (k, m) = 1, 
then 

    $ka_r + l$  (r = 1, 2, 3,..., m) 

is a complete system of incongruent residues (mod m). 

THEOREM 57. If (k, m) = d, then the congruence 

    (5.4.1)  $kx \equiv l \pmod{m}$ 

is soluble if and only if $d \mid l$. It has then just d solutions. In particular, 
if (k, m) = 1, the congruence has always just one solution. 

The congruence is equivalent to 

    $kx - my = l$, 

so that the result is partly contained in Theorem 25. It is naturally 
to be understood, when we say that the congruence has ‘just d’ solutions, 
that congruent solutions are regarded as the same. 

† ‘$\equiv$’ is the symbol of logical equivalence: if P and Q are propositions, then $P \equiv Q$ 
if $P \rightarrow Q$ and $Q \rightarrow P$. 

If d = 1, then Theorem 57 is a corollary of Theorem 56. If d > 1 
the congruence (5.4.1) is clearly insoluble unless $d \mid l$. If $d \mid l$, then 

    $m = dm'$,  $k = dk'$,  $l = dl'$, 

and the congruence is equivalent to 

    (5.4.2)  $k'x \equiv l' \pmod{m'}$. 

Since (k', m') = 1, (5.4.2) has just one solution. If this solution is 

    $x \equiv t \pmod{m'}$, 

then 

    $x = t + ym'$, 

and the complete set of solutions of (5.4.1) is found by giving y all 
values which lead to values of $t + ym'$ incongruent to modulus m. Since 

    $t + ym' \equiv t + zm' \pmod{m} \;\equiv\; m \mid m'(y - z) \;\equiv\; d \mid (y - z)$, 

there are just d solutions, represented by 

    $t,\; t + m',\; t + 2m',\; \dots,\; t + (d-1)m'$. 

This proves the theorem. 
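
The proof of Theorem 57 is effectively an algorithm: divide through by d, solve the reduced congruence, then add multiples of m'. The sketch below follows those steps; the function name is an assumption made here, and the modular inverse is obtained from Python's built-in `pow(k1, -1, m1)` (available from Python 3.8) rather than by the trial method of the text.

```python
from math import gcd


def solve_linear_congruence(k: int, l: int, m: int) -> list[int]:
    """All x with 0 <= x < m and k*x congruent to l (mod m), for k >= 0, m >= 1."""
    d = gcd(k, m)
    if l % d != 0:
        return []  # insoluble unless d | l
    k1, l1, m1 = k // d, l // d, m // d
    if m1 == 1:
        t = 0  # every integer is a solution of the reduced congruence
    else:
        # (k1, m1) = 1, so k1 has an inverse to modulus m1.
        t = (l1 * pow(k1, -1, m1)) % m1
    # The d incongruent solutions t, t + m', ..., t + (d - 1)m'.
    return [t + y * m1 for y in range(d)]


print(solve_linear_congruence(2, 1, 7))   # one solution, since (2, 7) = 1
print(solve_linear_congruence(6, 4, 10))  # d = 2 divides 4: two solutions
print(solve_linear_congruence(2, 1, 4))   # d = 2 does not divide 1: none
```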


5.5. Euler’s function $\phi(m)$. We denote by $\phi(m)$ the number of 
positive integers not greater than and prime to m, that is to say the 
number of integers n such that 

    $0 < n \le m$,  (n, m) = 1.† 

If a is prime to m, then so is any number x congruent to a (mod m). 
There are $\phi(m)$ classes of residues prime to m, and any set of $\phi(m)$ 
residues, one from each class, is called a complete set of residues prime 
to m. One such complete set is the set of $\phi(m)$ numbers less than and 
prime to m. 

† n can be equal to m only when n = 1. Thus $\phi(1) = 1$. 

THEOREM 58. If $a_1, a_2, \dots, a_{\phi(m)}$ is a complete set of residues prime to 
m, and (k, m) = 1, then 

    $ka_1, ka_2, \dots, ka_{\phi(m)}$ 

is also such a set. 

For the numbers of the second set are plainly all prime to m, and, 
as in the proof of Theorem 56, no two of them are congruent. 

THEOREM 59. Suppose that (m, m') = 1, and that a runs through a 
complete set of residues (mod m), and a' through a complete set of residues 
(mod m'). Then a'm + am' runs through a complete set of residues 
(mod mm'). 

There are mm' numbers a'm + am'. If 

    $a_1' m + a_1 m' \equiv a_2' m + a_2 m' \pmod{mm'}$, 

then 

    $a_1 m' \equiv a_2 m' \pmod{m}$, 

and so 

    $a_1 \equiv a_2 \pmod{m}$; 

and similarly 

    $a_1' \equiv a_2' \pmod{m'}$. 

Hence the mm' numbers are all incongruent and form a complete set of 
residues (mod mm'). 
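
Theorem 59 is easy to test by listing the numbers a'm + am'. In this small sketch the function name `combined_residues` is introduced here, and the least non-negative residues are used as the complete sets; it also shows that the conclusion fails when m and m' have a common factor.

```python
def combined_residues(m: int, m_prime: int) -> set[int]:
    """Residues (mod m*m') of a'*m + a*m' for a in range(m), a' in range(m')."""
    modulus = m * m_prime
    return {
        (a_p * m + a * m_prime) % modulus
        for a in range(m)
        for a_p in range(m_prime)
    }


# (4, 9) = 1: all 36 residues (mod 36) occur.
print(combined_residues(4, 9) == set(range(36)))
# (4, 6) = 2: the 24 combinations collapse onto fewer residues (mod 24).
print(len(combined_residues(4, 6)))
```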
A function f (m) is said to be multiplicative if (m, m’) = 1 implies 
f(mm’) = f(m)f(m’). 

THEOREM 60. $\phi(n)$ is multiplicative. 

If (m, m') = 1, then, by Theorem 59, a'm + am' runs through a complete 
set (mod mm') when a and a' run through complete sets (mod m) 
and (mod m') respectively. Also 

    $(a'm + am', mm') = 1 \;\equiv\; (a'm + am', m) = 1 \;.\; (a'm + am', m') = 1$ 
        $\equiv\; (am', m) = 1 \;.\; (a'm, m') = 1$ 
        $\equiv\; (a, m) = 1 \;.\; (a', m') = 1$. 

Hence the $\phi(mm')$ numbers less than and prime to mm' are the least 
positive residues of the $\phi(m)\phi(m')$ values of a'm + am' for which a is 
prime to m and a' to m'; and therefore 

    $\phi(mm') = \phi(m)\phi(m')$. 

Incidentally we have proved 

THEOREM 61. If (m, m') = 1, a runs through a complete set of residues 
prime to m, and a' through a complete set of residues prime to m', then 
am' + a'm runs through a complete set of residues prime to mm'. 

We can now find the value of $\phi(m)$ for any value of m. By Theorem 
60, it is sufficient to calculate $\phi(m)$ when m is a power of a prime. Now 
there are $p^c - 1$ positive numbers less than $p^c$, of which $p^{c-1} - 1$ are 
multiples of p and the remainder prime to p. Hence 

    $\phi(p^c) = p^c - p^{c-1} = p^c\left(1 - \frac{1}{p}\right)$; 

and the general value of $\phi(m)$ follows from Theorem 60. 

THEOREM 62. If $m = \prod p^c$, then 

    $\phi(m) = m \prod_{p \mid m} \left(1 - \frac{1}{p}\right)$. 

We shall also require 

THEOREM 63. 

    $\sum_{d \mid m} \phi(d) = m$. 

If $m = \prod p^c$, then the divisors of m are the numbers $d = \prod p^{c'}$, 
where $0 \le c' \le c$ for each p; and 

    $\Phi(m) = \sum_{d \mid m} \phi(d) = \sum_{p, c'} \prod \phi(p^{c'}) = \prod_p \{1 + \phi(p) + \phi(p^2) + \dots + \phi(p^c)\}$, 

by the multiplicative property of $\phi(m)$. But 

    $1 + \phi(p) + \dots + \phi(p^c) = 1 + (p - 1) + p(p - 1) + \dots + p^{c-1}(p - 1) = p^c$, 

so that 

    $\Phi(m) = \prod_p p^c = m$. 
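
Theorems 62 and 63 can both be checked against the definition of $\phi(m)$ as a count. The following sketch is illustrative: the function names are assumptions introduced here, and trial division is used only because the numbers are small.

```python
from math import gcd


def phi_by_counting(m: int) -> int:
    """phi(m) from the definition: count n with 0 < n <= m and (n, m) = 1."""
    return sum(1 for n in range(1, m + 1) if gcd(n, m) == 1)


def prime_divisors(m: int) -> list[int]:
    """Distinct primes dividing m, found by trial division."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes


def phi_by_formula(m: int) -> int:
    """phi(m) = m * prod(1 - 1/p) over primes p dividing m (Theorem 62)."""
    result = m
    for p in prime_divisors(m):
        result = result // p * (p - 1)  # exact: p still divides result here
    return result


def divisor_phi_sum(m: int) -> int:
    """Sum of phi(d) over the divisors d of m (left-hand side of Theorem 63)."""
    return sum(phi_by_formula(d) for d in range(1, m + 1) if m % d == 0)


print(phi_by_counting(36), phi_by_formula(36), divisor_phi_sum(36))
```

For m = 36 the first two values agree and the third is 36 itself, as Theorem 63 asserts.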


5.6. Applications of Theorems 59 and 61 to trigonometrical 
sums. There are certain trigonometrical sums which are important in 
the theory of numbers and which are either ‘multiplicative’ in the sense 
of § 5.5 or possess very similar properties. 

We write† 

    $e(\tau) = e^{2\pi i \tau}$; 

we shall be concerned only with rational values of $\tau$. It is clear that 

    $e\left(\frac{m}{n}\right) = e\left(\frac{m'}{n}\right)$ 

when $m \equiv m' \pmod{n}$. It is this property which gives trigonometrical 
sums their arithmetical importance. 

(1) Multiplicative property of Gauss’s sum. Gauss’s sum, which is 
particularly important in the theory of quadratic residues, is 

    $S(m, n) = \sum_{h=0}^{n-1} e^{2\pi i h^2 m / n} = \sum_{h=0}^{n-1} e\left(\frac{h^2 m}{n}\right)$. 

Since 

    $e\left(\frac{(h + rn)^2 m}{n}\right) = e\left(\frac{h^2 m}{n}\right)$ 

for any r, we have 

    $e\left(\frac{h_1^2 m}{n}\right) = e\left(\frac{h_2^2 m}{n}\right)$ 

whenever $h_1 \equiv h_2 \pmod{n}$. We may therefore write 

    $S(m, n) = \sum_{h(n)} e\left(\frac{h^2 m}{n}\right)$, 

the notation implying that h runs through any complete system of 
residues mod n. When there is no risk of ambiguity, we shall write h 
instead of h(n). 

† Throughout this section $e^{\zeta}$ is the exponential function $e^{\zeta} = 1 + \zeta + \dots$ of the complex 
variable $\zeta$. We assume a knowledge of the elementary properties of the exponential 
function. 

THEOREM 64. If (n, n') = 1, then 

    $S(m, nn') = S(mn', n)\,S(mn, n')$. 

Let h, h' run through complete systems of residues to modulus n, n' 
respectively. Then, by Theorem 59, 

    $H = hn' + h'n$ 

runs through a complete set of residues to modulus nn'. Also 

    $mH^2 = m(hn' + h'n)^2 \equiv mh^2 n'^2 + mh'^2 n^2 \pmod{nn'}$. 

Hence 

    $S(mn', n)\,S(mn, n') = \left\{\sum_h e\left(\frac{h^2 mn'}{n}\right)\right\}\left\{\sum_{h'} e\left(\frac{h'^2 mn}{n'}\right)\right\}$ 
        $= \sum_{h, h'} e\left(\frac{h^2 mn'}{n} + \frac{h'^2 mn}{n'}\right)$ 
        $= \sum_{h, h'} e\left(\frac{m(h^2 n'^2 + h'^2 n^2)}{nn'}\right)$ 
        $= \sum_H e\left(\frac{mH^2}{nn'}\right) = S(m, nn')$. 
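
The multiplicative property of Gauss's sum can be confirmed numerically in floating-point arithmetic. This is a rough sketch only: the names `e` and `gauss_sum` mirror the notation of the text but are defined here, and agreement is tested up to rounding error rather than exactly.

```python
import cmath


def e(t: float) -> complex:
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def gauss_sum(m: int, n: int) -> complex:
    """S(m, n) = sum over h = 0..n-1 of e(h^2 m / n)."""
    # Reducing h*h*m to modulus n first keeps the floating-point argument small.
    return sum(e(((h * h * m) % n) / n) for h in range(n))


m, n, n_prime = 3, 5, 7
left = gauss_sum(m, n * n_prime)
right = gauss_sum(m * n_prime, n) * gauss_sum(m * n, n_prime)
print(abs(left - right) < 1e-9)
```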


(2) Multiplicative property of Ramanujan’s sum. Ramanujan’s sum is 

    $c_q(m) = \sum_{h^*(q)} e\left(\frac{hm}{q}\right)$, 

the notation here implying that h runs only through residues prime to 
q. We shall sometimes write h instead of h*(q) when there is no risk 
of ambiguity. 

We may write $c_q(m)$ in another form which introduces a notion of 
more general importance. We call $\rho$ a primitive q-th root of unity if 
$\rho^q = 1$ but $\rho^r$ is not 1 for any positive value of r less than q. 

Suppose that $\rho^q = 1$ and that r is the least positive integer for which 
$\rho^r = 1$. Then q = kr + s, where $0 \le s < r$. Also 

    $\rho^s = \rho^{q - kr} = 1$, 

so that s = 0 and $r \mid q$. Hence 

THEOREM 65. Any q-th root of unity is a primitive r-th root, for some 
divisor r of q. 

THEOREM 66. The q-th roots of unity are the numbers 

    $e\left(\frac{h}{q}\right)$  (h = 0, 1,..., q - 1), 

and a necessary and sufficient condition that the root should be primitive 
is that h should be prime to q. 

We may now write Ramanujan’s sum in the form 

    $c_q(m) = \sum \rho^m$, 

where $\rho$ runs through the primitive q-th roots of unity. 

THEOREM 67. If (q, q') = 1, then 

    $c_{qq'}(m) = c_q(m)\,c_{q'}(m)$. 

For 

    $c_q(m)\,c_{q'}(m) = \sum_{h, h'} e\left\{m\left(\frac{h}{q} + \frac{h'}{q'}\right)\right\} = \sum_{h, h'} e\left\{\frac{m(hq' + h'q)}{qq'}\right\} = c_{qq'}(m)$, 

by Theorem 61. 
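
Theorem 67 can be checked numerically in the same way as Theorem 64. In this sketch `ramanujan_sum` is a name introduced here; it sums over the residues h prime to q with 1 ≤ h ≤ q, which is one admissible choice of a complete set of residues prime to q.

```python
import cmath
from math import gcd


def e(t: float) -> complex:
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def ramanujan_sum(q: int, m: int) -> complex:
    """c_q(m): sum of e(hm/q) over h prime to q, 1 <= h <= q."""
    return sum(e(((h * m) % q) / q) for h in range(1, q + 1) if gcd(h, q) == 1)


for m in range(1, 7):
    product = ramanujan_sum(3, m) * ramanujan_sum(5, m)
    print(m, round(ramanujan_sum(15, m).real), round(product.real))
```

The two printed columns agree for each m; rounding is used only to hide floating-point noise.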


(3) Multiplicative property of Kloosterman’s sum. Kloosterman’s sum 
(which is rather more recondite) is 

    $S(u, v, n) = \sum_h e\left(\frac{uh + v\bar{h}}{n}\right)$, 

where h runs through a complete set of residues prime to n, and $\bar{h}$ is 
defined by 

    $h\bar{h} \equiv 1 \pmod{n}$. 

Theorem 57 shows us that, given any h, there is a unique $\bar{h}$ (mod n) 
which satisfies this condition. We shall make no use of Kloosterman’s 
sum, but the proof of its multiplicative property gives an excellent 
illustration of the ideas of the preceding sections. 

THEOREM 68. If (n, n') = 1, then 

    $S(u, v, n)\,S(u, v', n') = S(u, V, nn')$, 

where $V = vn'^2 + v'n^2$. 

If $h\bar{h} \equiv 1 \pmod{n}$, $h'\bar{h}' \equiv 1 \pmod{n'}$, then 

    (5.6.1)  $S(u, v, n)\,S(u, v', n') = \sum_{h, h'} e\left(\frac{uh + v\bar{h}}{n} + \frac{uh' + v'\bar{h}'}{n'}\right)$ 
        $= \sum_{h, h'} e\left(\frac{u(hn' + h'n)}{nn'} + \frac{v\bar{h}n' + v'\bar{h}'n}{nn'}\right)$ 
        $= \sum_{h, h'} e\left(\frac{uH + K}{nn'}\right)$, 

where 

    $H = hn' + h'n$,  $K = v\bar{h}n' + v'\bar{h}'n$. 

By Theorem 61, H runs through a complete system of residues prime 
to nn'. Hence, if we can show that 

    (5.6.2)  $K \equiv V\bar{H} \pmod{nn'}$, 

where $\bar{H}$ is defined by 

    $H\bar{H} \equiv 1 \pmod{nn'}$, 

then (5.6.1) will reduce to 

    $S(u, v, n)\,S(u, v', n') = \sum_H e\left(\frac{uH + V\bar{H}}{nn'}\right) = S(u, V, nn')$. 

Now 

    $(hn' + h'n)\bar{H} = H\bar{H} \equiv 1 \pmod{nn'}$. 

Hence 

    $hn'\bar{H} \equiv 1 \pmod{n}$,  $n'\bar{H} \equiv \bar{h}hn'\bar{H} \equiv \bar{h} \pmod{n}$, 

and so 

    (5.6.3)  $n'^2\bar{H} \equiv n'\bar{h} \pmod{nn'}$. 

Similarly we see that 

    (5.6.4)  $n^2\bar{H} \equiv n\bar{h}' \pmod{nn'}$; 

and from (5.6.3) and (5.6.4) we deduce 

    $V\bar{H} = (vn'^2 + v'n^2)\bar{H} \equiv vn'\bar{h} + v'n\bar{h}' = K \pmod{nn'}$. 

This is (5.6.2), and the theorem follows. 
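
A numerical check of Theorem 68, including the modified parameter V, may make the statement more concrete. This is a sketch under the assumption that n > 1; the function name `kloosterman` is introduced here, and the inverse of h is taken from Python's `pow(h, -1, n)` (Python 3.8 or later).

```python
import cmath
from math import gcd


def e(t: float) -> complex:
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def kloosterman(u: int, v: int, n: int) -> complex:
    """S(u, v, n): sum of e((u*h + v*h_bar)/n) over h prime to n, for n > 1."""
    total = 0
    for h in range(1, n):
        if gcd(h, n) == 1:
            h_bar = pow(h, -1, n)  # h * h_bar is congruent to 1 (mod n)
            total += e(((u * h + v * h_bar) % n) / n)
    return total


u, v, v_prime, n, n_prime = 2, 3, 7, 5, 8
V = v * n_prime ** 2 + v_prime * n ** 2
left = kloosterman(u, v, n) * kloosterman(u, v_prime, n_prime)
right = kloosterman(u, V, n * n_prime)
print(abs(left - right) < 1e-9)
```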


5.7. A general principle. We return for a moment to the argument 
which we used in proving Theorem 65. It will avoid a good deal 
of repetition later if we restate the theorem and the proof in a more 
general form. We use P(a) to denote any proposition asserting a 
property of a non-negative integer a. 

THEOREM 69. If 

(i) P(a) and P(b) imply P(a+b) and P(a-b), for every a and b 
(provided, in the second case, that $b \le a$), 

(ii) r is the least positive integer for which P(r) is true, 

then 

(a) P(kr) is true for every non-negative integer k, 

(b) any q for which P(q) is true is a multiple of r. 

In the first place, (a) is obvious. 

To prove (b) we observe that $0 < r \le q$, by the definition of r. Hence 
we can write 

    q = kr + s,  s = q - kr, 

where $k \ge 1$ and $0 \le s < r$. But $P(r) \rightarrow P(kr)$, by (a), and 

    $P(q) \;.\; P(kr) \rightarrow P(s)$, 

by (i). Hence, again by the definition of r, s must be 0, and q = kr. 

We can also deduce Theorem 69 from Theorem 23. In Theorem 65, 
P(a) is $\rho^a = 1$. 

5.8. Construction of the regular polygon of 17 sides. We conclude 
this chapter by a short excursus on one of the famous problems 
of elementary geometry, that of the construction of a regular polygon 
of n sides, or of an angle $\alpha = 2\pi/n$. 

Suppose that $(n_1, n_2) = 1$ and that the problem is soluble for $n = n_1$ 
and for $n = n_2$. There are integers $r_1$ and $r_2$ such that 

    $r_1 n_1 + r_2 n_2 = 1$ 

or 

    $r_1 \alpha_2 + r_2 \alpha_1 = r_1 \frac{2\pi}{n_2} + r_2 \frac{2\pi}{n_1} = \frac{2\pi}{n_1 n_2}$. 

Hence, if the problem is soluble for $n = n_1$ and $n = n_2$, it is soluble 
for $n = n_1 n_2$. It follows that we need only consider cases in which n 
is a power of a prime. In what follows we suppose n = p prime. 

We can construct $\alpha$ if we can construct $\cos\alpha$ (or $\sin\alpha$); and the 
numbers 

    $\cos k\alpha + i \sin k\alpha$  (k = 1, 2,..., n - 1) 

are the roots of 

    (5.8.1)  $\frac{x^n - 1}{x - 1} = x^{n-1} + x^{n-2} + \dots + 1 = 0$. 

Hence we can construct $\alpha$ if we can construct the roots of (5.8.1). 

‘Euclidean’ constructions, by ruler and compass, are equivalent 
analytically to the solution of a series of linear or quadratic equations.† 
Hence our construction is possible if we can reduce the solution of 
(5.8.1) to that of such a series of equations. 

The problem was solved by Gauss, who proved (as we stated in § 2.4) 
that the reduction is possible if and only if n is a ‘Fermat prime’‡ 

    $n = p = 2^{2^h} + 1 = F_h$. 

The first five values of h, viz. 0, 1, 2, 3, 4, give 

    n = 3, 5, 17, 257, 65537, 

all of which are prime, and in these cases the problem is soluble. 

The constructions for n = 3 and n = 5 are familiar. We give here 
the construction for n = 17. We shall not attempt any systematic 
exposition of Gauss’s theory; but this particular construction gives a 
fair example of the working of his method, and should make it plain 
to the reader that (as is plausible from the beginning) success is to be 
expected when n = p and p - 1 does not contain any prime but 2. 
This requires that p is a prime of the form $2^m + 1$, and the only such 
primes are the Fermat primes.|| 

† See § 11.5.   ‡ See § 2.5.   || See § 2.5, Theorem 17. 

Suppose then that n = 17. The corresponding equation is 

    (5.8.2)  $\frac{x^{17} - 1}{x - 1} = x^{16} + x^{15} + \dots + 1 = 0$. 

We write 

    $\alpha = \frac{2\pi}{17}$,  $\epsilon_k = e\left(\frac{k}{17}\right) = \cos k\alpha + i \sin k\alpha$, 

so that the roots of (5.8.2) are 

    (5.8.3)  $x = \epsilon_1, \epsilon_2, \dots, \epsilon_{16}$. 

From these roots we form certain sums, known as periods, which are 
the roots of quadratic equations. 

The numbers 

    $3^m$  ($0 \le m \le 15$) 

are congruent (mod 17), in some order, to the numbers k = 1, 2,..., 16,† 
as is shown by the table 

    m = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 
    k = 1, 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6. 
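
The table is simply the list of least residues of the powers of 3 to modulus 17, and can be regenerated in one line. The variable names in this sketch are introduced here; the slices group the residues k according to the parity of m and according to m (mod 4), the index sets from which periods are formed.

```python
# k is the least residue of 3^m (mod 17) for m = 0, 1, ..., 15.
powers = [pow(3, m, 17) for m in range(16)]

even_m = powers[0::2]  # residues k belonging to even m
odd_m = powers[1::2]   # residues k belonging to odd m
by_m_mod_4 = {r: powers[r::4] for r in range(4)}

print(powers)
print(even_m, odd_m)
print(by_m_mod_4)
```

Sorting `powers` gives 1, 2,..., 16, which is the statement that the sixteen powers are congruent, in some order, to the sixteen numbers k.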
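We define $x_1$ and $x_2$ by 

    $x_1 = \sum_{m\ \mathrm{even}} \epsilon_k = \epsilon_1 + \epsilon_9 + \epsilon_{13} + \epsilon_{15} + \epsilon_{16} + \epsilon_8 + \epsilon_4 + \epsilon_2$, 
    $x_2 = \sum_{m\ \mathrm{odd}} \epsilon_k = \epsilon_3 + \epsilon_{10} + \epsilon_5 + \epsilon_{11} + \epsilon_{14} + \epsilon_7 + \epsilon_{12} + \epsilon_6$; 

and $y_1, y_2, y_3, y_4$ by 

    $y_1 = \sum_{m \equiv 0 \pmod 4} \epsilon_k = \epsilon_1 + \epsilon_{13} + \epsilon_{16} + \epsilon_4$, 
    $y_2 = \sum_{m \equiv 2 \pmod 4} \epsilon_k = \epsilon_9 + \epsilon_{15} + \epsilon_8 + \epsilon_2$, 
    $y_3 = \sum_{m \equiv 1 \pmod 4} \epsilon_k = \epsilon_3 + \epsilon_5 + \epsilon_{14} + \epsilon_{12}$, 
    $y_4 = \sum_{m \equiv 3 \pmod 4} \epsilon_k = \epsilon_{10} + \epsilon_{11} + \epsilon_7 + \epsilon_6$. 

Since 

    $\epsilon_k + \epsilon_{17-k} = 2\cos k\alpha$, 

we have 

    $x_1 = 2(\cos\alpha + \cos 8\alpha + \cos 4\alpha + \cos 2\alpha)$, 
    $x_2 = 2(\cos 3\alpha + \cos 7\alpha + \cos 5\alpha + \cos 6\alpha)$, 
    $y_1 = 2(\cos\alpha + \cos 4\alpha)$,  $y_2 = 2(\cos 8\alpha + \cos 2\alpha)$, 
    $y_3 = 2(\cos 3\alpha + \cos 5\alpha)$,  $y_4 = 2(\cos 7\alpha + \cos 6\alpha)$. 

We prove first that $x_1$ and $x_2$ are the roots of a quadratic equation 
with rational coefficients. Since the roots of (5.8.2) are the numbers 
(5.8.3), we have 

    $x_1 + x_2 = 2 \sum_{k=1}^{8} \cos k\alpha = \sum_{k=1}^{16} \epsilon_k = -1$. 

Again, 

    $x_1 x_2 = 4(\cos\alpha + \cos 8\alpha + \cos 4\alpha + \cos 2\alpha) \times (\cos 3\alpha + \cos 7\alpha + \cos 5\alpha + \cos 6\alpha)$. 

If we multiply out the right-hand side and use the identity 

    (5.8.4)  $2\cos m\alpha \cos n\alpha = \cos(m+n)\alpha + \cos(m-n)\alpha$, 

we obtain 

    $x_1 x_2 = 4(x_1 + x_2) = -4$. 

Hence $x_1$ and $x_2$ are the roots of 

    (5.8.5)  $x^2 + x - 4 = 0$. 

Also 

    $\cos\alpha + \cos 2\alpha > 2\cos\tfrac{1}{4}\pi = \sqrt{2} > -\cos 8\alpha$,  $\cos 4\alpha > 0$. 

Hence $x_1 > 0$ and therefore 

    (5.8.6)  $x_1 > x_2$. 

† In fact 3 is a ‘primitive root of 17’ in the sense which will be explained in § 6.8. 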

We prove next that $y_1, y_2$ and $y_3, y_4$ are the roots of quadratic equations 
whose coefficients are rational in $x_1$ and $x_2$. We have 

    $y_1 + y_2 = x_1$ 

and, using (5.8.4) again, 

    $y_1 y_2 = 4(\cos\alpha + \cos 4\alpha)(\cos 8\alpha + \cos 2\alpha) = 2 \sum_{k=1}^{8} \cos k\alpha = -1$. 

Hence $y_1, y_2$ are the roots of 

    (5.8.7)  $y^2 - x_1 y - 1 = 0$; 

and it is plain that 

    (5.8.8)  $y_1 > y_2$. 

Similarly 

    $y_3 + y_4 = x_2$,  $y_3 y_4 = -1$, 

and so $y_3, y_4$ are the roots of 

    (5.8.9)  $y^2 - x_2 y - 1 = 0$, 

and 

    (5.8.10)  $y_3 > y_4$. 

Finally 

    $2\cos\alpha + 2\cos 4\alpha = y_1$, 
    $4\cos\alpha \cos 4\alpha = 2(\cos 5\alpha + \cos 3\alpha) = y_3$. 

Also $\cos\alpha > \cos 4\alpha$. Hence $z_1 = 2\cos\alpha$ and $z_2 = 2\cos 4\alpha$ are the roots 
of the quadratic 

    (5.8.11)  $z^2 - y_1 z + y_3 = 0$ 

and 

    (5.8.12)  $z_1 > z_2$. 

We can now determine $z_1 = 2\cos\alpha$ by solving the four quadratics 
(5.8.5), (5.8.7), (5.8.9), and (5.8.11), and remembering the associated 
inequalities. We obtain 

    $2\cos\alpha = \tfrac{1}{8}\{-1 + \sqrt{17} + \sqrt{34 - 2\sqrt{17}}\}$ 
        $+ \tfrac{1}{8}\sqrt{68 + 12\sqrt{17} - 16\sqrt{34 + 2\sqrt{17}} - 2(1 - \sqrt{17})\sqrt{34 - 2\sqrt{17}}}$, 

an expression involving only rationals and square roots. This number 
may now be constructed by the use of the ruler and compass only, and 
so $\alpha$ may be constructed. 
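
The chain of quadratics and the final expression can be verified in floating-point arithmetic. This is only a numerical sanity check, not a proof: the variable names follow the text, the helper `c` is introduced here, and the closed form is taken with both groups of terms divided by 8.

```python
from math import cos, isclose, pi, sqrt

alpha = 2 * pi / 17


def c(k: int) -> float:
    """cos(k * alpha) with alpha = 2*pi/17."""
    return cos(k * alpha)


x1 = 2 * (c(1) + c(8) + c(4) + c(2))
x2 = 2 * (c(3) + c(7) + c(5) + c(6))
y1, y2 = 2 * (c(1) + c(4)), 2 * (c(8) + c(2))
y3, y4 = 2 * (c(3) + c(5)), 2 * (c(7) + c(6))
z1 = 2 * c(1)

r17 = sqrt(17)
closed_form = (-1 + r17 + sqrt(34 - 2 * r17)) / 8 + sqrt(
    68 + 12 * r17 - 16 * sqrt(34 + 2 * r17) - 2 * (1 - r17) * sqrt(34 - 2 * r17)
) / 8

print(x1 + x2, x1 * x2)          # sum and product for (5.8.5)
print(y1 + y2 - x1, y1 * y2)     # (5.8.7)
print(y3 + y4 - x2, y3 * y4)     # (5.8.9)
print(z1 * z1 - y1 * z1 + y3)    # (5.8.11), should vanish
print(closed_form, z1)
```

Up to rounding error, the sums and products match the coefficients of the four quadratics, and the closed form agrees with 2 cos(2π/17).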

There is a simpler geometrical construction. Let C be the least 
positive acute angle such that 

    $\tan 4C = 4$, 

so that C, 2C, and 4C are all acute. Then (5.8.5) may be written 

    $x^2 + 4x \cot 4C - 4 = 0$. 

[Fig. 6] 

The roots of this equation are 

    $2\tan 2C$,  $-2\cot 2C$. 

Since $x_1 > x_2$, this gives 

    $x_1 = 2\tan 2C$,  $x_2 = -2\cot 2C$. 

Substituting in (5.8.7) and (5.8.9) and solving, we obtain 

    $y_1 = \tan(C + \tfrac{1}{4}\pi)$,  $y_3 = \tan C$, 
    $y_2 = \tan(C - \tfrac{1}{4}\pi)$,  $y_4 = -\cot C$. 

Hence 

    (5.8.13)  $2\cos 3\alpha + 2\cos 5\alpha = y_3 = \tan C$, 
              $2\cos 3\alpha \cdot 2\cos 5\alpha = 2\cos 2\alpha + 2\cos 8\alpha = y_2 = \tan(C - \tfrac{1}{4}\pi)$. 

Now let OA, OB (Fig. 6) be two perpendicular radii of a circle. Make 
OI one-fourth of OB and the angle OIE (with E in OA) one-fourth of 
the angle OIA. Find on AO produced a point F such that $EIF = \tfrac{1}{4}\pi$. 
Let the circle on AF as diameter cut OB in K, and let the circle whose 
centre is E and radius EK cut OA in $N_3$ and $N_5$ ($N_3$ on OA, $N_5$ on AO 
produced). Draw $N_3 P_3$, $N_5 P_5$ perpendicular to OA to cut the 
circumference of the original circle in $P_3$ and $P_5$. 

Then OIA = 4C and OIE = C. Also 

    $2\cos AOP_3 + 2\cos AOP_5 = 2\,\frac{ON_3 - ON_5}{OA} = \frac{4\,OE}{OA} = \frac{OE}{OI} = \tan C$, 

    $2\cos AOP_3 \cdot 2\cos AOP_5 = -4\,\frac{ON_3 \cdot ON_5}{OA^2} = -4\,\frac{OK^2}{OA^2} = -4\,\frac{OF}{OA} = -\frac{OF}{OI} = \tan(C - \tfrac{1}{4}\pi)$. 

Comparing these equations with (5.8.13), we see that $AOP_3 = 3\alpha$ and 
$AOP_5 = 5\alpha$. 

It follows that A, $P_3$, $P_5$ are the first, fourth, and sixth vertices of a 
regular polygon of 17 sides inscribed in the circle; and it is obvious how 
the polygon may be completed. 


NOTES ON CHAPTER V 


§ 5.1. The contents of this chapter are all ‘classical’ (except the properties of 
Ramanujan’s and Kloosterman’s sums proved in § 5.6), and will be found in 
text-books. The theory of congruences was first developed scientifically by Gauss, 
D.A., though the main results must have been familiar to earlier mathematicians 
such as Fermat and Euler. We give occasional references, especially when some 
famous function or theorem is habitually associated with the name of a particular 
mathematician, but make no attempt to be systematic. 

§ 5.5. Euler, Novi Comm. Acad. Petrop. 8 (1760-1), 74-104 [Opera (1), ii. 
531-44]. 

It might seem more natural to say that f(m) is multiplicative if 

    f(mm') = f(m)f(m') 

for all m, m'. This definition would be too restrictive, and the less exacting 
definition of the text is much more useful. 

§ 5.6. The sums of this section occur in Gauss, ‘Summatio quarumdam 
serierum singularium’ (1808), Werke, ii. 11-45; Ramanujan, Trans. Camb. Phil. 
Soc. 22 (1918), 259-76 (Collected Papers, 179-99); Kloosterman, Acta Math. 49 
(1926), 407-64. ‘Ramanujan’s sum’ may be found in earlier writings; see, for 
example, Jensen, Beretning d. tredje Skand. Matematikercongres (1913), 145, and 
Landau, Handbuch, 572: but Ramanujan was the first mathematician to see its 
full importance and use it systematically. It is particularly important in the 
theory of the representation of numbers by sums of squares. 

§ 5.8. The general theory was developed by Gauss, D.A., §§ 335-66. The first 
explicit geometrical construction of the 17-agon was made by Erchinger (see 
Gauss, Werke, ii. 186-7). That in the text is due to Richmond, Quarterly Journal 
of Math. 26 (1893), 206-7, and Math. Annalen, 67 (1909), 459-61. Our figure is 
copied from Richmond’s. 

Gauss (D.A., § 341) proved that the equation (5.8.1) is irreducible, i.e. that 
its left-hand side cannot be resolved into factors of lower degree with rational 
coefficients, when n is prime. Kronecker and Eisenstein proved, more generally, 
that the equation satisfied by the $\phi(m)$ primitive mth roots of unity is irreducible; 
see, for example, Mathews, 186-8. Grandjot has shown that the theorem can be 
deduced very simply from Dirichlet’s Theorem 15: see Landau, Vorlesungen, iii. 219. 


VI 
FERMAT’S THEOREM AND ITS CONSEQUENCES 


6.1. Fermat's theorem. In this chapter we apply the general ideas 
of Ch. V to the proof of a series of classical theorems, due mainly to 
Fermat, Euler, Legendre, and Gauss. 


THEOREM 70. If p is prime, then 

    (6.1.1)  $a^p \equiv a \pmod{p}$. 

THEOREM 71 (FERMAT’S THEOREM). If p is prime, and $p \nmid a$, then 

    (6.1.2)  $a^{p-1} \equiv 1 \pmod{p}$. 

The congruences (6.1.1) and (6.1.2) are equivalent when $p \nmid a$; and 
(6.1.1) is trivial when $p \mid a$, since then $a^p \equiv 0 \equiv a$. Hence Theorems 
70 and 71 are equivalent. 

Theorem 71 is a particular case of the more general 

THEOREM 72 (THE FERMAT-EULER THEOREM). If (a, m) = 1, then 

    $a^{\phi(m)} \equiv 1 \pmod{m}$. 

If x runs through a complete system of residues prime to m, then, by 
Theorem 58, ax also runs through such a system. Hence, taking the 
product of each set, we have 

    $\prod (ax) \equiv \prod x \pmod{m}$ 

or 

    $a^{\phi(m)} \prod x \equiv \prod x \pmod{m}$. 

Since every number x is prime to m, their product is prime to m; and 
hence, by Theorem 55, 

    $a^{\phi(m)} \equiv 1 \pmod{m}$. 

The result is plainly false if (a, m) > 1. 
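
The Fermat-Euler theorem is simple to test for small moduli with modular exponentiation. In the sketch below the function names are assumptions introduced here, and $\phi(m)$ is computed directly from its definition as a count.

```python
from math import gcd


def phi(m: int) -> int:
    """Euler's function, counted from the definition."""
    return sum(1 for n in range(1, m + 1) if gcd(n, m) == 1)


def fermat_euler_holds(m: int) -> bool:
    """True if a^phi(m) is congruent to 1 (mod m) for every a prime to m."""
    exponent = phi(m)
    return all(pow(a, exponent, m) == 1 for a in range(1, m) if gcd(a, m) == 1)


print(all(fermat_euler_holds(m) for m in range(2, 200)))
# The condition (a, m) = 1 cannot be dropped: here (2, 4) = 2.
print(pow(2, phi(4), 4))
```

The second line prints 0 rather than 1, an instance of the remark that the result is false when (a, m) > 1.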


6.2. Some properties of binomial coefficients. Euler was the 
first to publish a proof of Fermat’s theorem. The proof, which is easily 
extended so as to prove Theorem 72, depends on the simplest 
arithmetical properties of the binomial coefficients. 

THEOREM 73. If m and n are positive integers, then the binomial 
coefficients 

    $\binom{m}{n} = \frac{m(m-1)\dots(m-n+1)}{n!}$,  $\binom{-m}{n} = (-1)^n \frac{m(m+1)\dots(m+n-1)}{n!}$ 

are integers. 

It is the first part of the theorem which we need here, but, since 

    $\binom{-m}{n} = (-1)^n \binom{m+n-1}{n}$, 

the two parts are equivalent. Either part may be stated in a more 
striking form, viz. 

THEOREM 74. The product of any n successive positive integers is 
divisible by n!. 

The theorems are obvious from the genesis of the binomial coefficients 
as the coefficients of powers of x in (1+x)(1+x)... or in 

    $(1-x)^{-1}(1-x)^{-1}\dots = (1 + x + x^2 + \dots)(1 + x + x^2 + \dots)\dots$. 

We may prove them by induction as follows. We choose Theorem 74, 
which asserts that 

    $(m)_n = m(m+1)\dots(m+n-1)$ 

is divisible by n!. This is plainly true for n = 1 and all m, and also for 
m = 1 and all n. We assume that it is true (a) for n = N-1 and all 
m and (b) for n = N and m = M. Then 

    $(M+1)_N - M_N = N(M+1)_{N-1}$, 

and $(M+1)_{N-1}$ is divisible by (N-1)!. Hence $(M+1)_N$ is divisible by 
N!, and the theorem is true for n = N and m = M+1. It follows that 
the theorem is true for n = N and all m. Since it is also true for 
n = N+1 and m = 1, we can repeat the argument; and the theorem 
is true generally. 


THEOREM 75. If p is prime, then 

    $\binom{p}{1}, \binom{p}{2}, \dots, \binom{p}{p-1}$ 

are divisible by p. 

If $1 \le n \le p - 1$, then 

    $n! \mid p(p-1)\dots(p-n+1)$, 

by Theorem 74. But n! is prime to p, and therefore 

    $n! \mid (p-1)(p-2)\dots(p-n+1)$. 

Hence 

    $\binom{p}{n} = p\,\frac{(p-1)(p-2)\dots(p-n+1)}{n!}$ 

is divisible by p. 
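
Theorems 74 and 75 lend themselves to a direct check with the standard library. This sketch uses `math.prod`, `math.comb`, and `math.factorial` (Python 3.8 or later); the function name `successive_product` is introduced here.

```python
from math import comb, factorial, prod


def successive_product(m: int, n: int) -> int:
    """(m)_n = m(m+1)...(m+n-1), the product of n successive integers."""
    return prod(range(m, m + n))


# Theorem 74: n! divides the product of any n successive positive integers.
print(all(
    successive_product(m, n) % factorial(n) == 0
    for m in range(1, 30) for n in range(1, 10)
))

# Theorem 75: p divides C(p, n) for 1 <= n <= p - 1 when p is prime.
print(all(comb(p, n) % p == 0 for p in (2, 3, 5, 7, 11, 13) for n in range(1, p)))

# Primality matters: C(4, 2) = 6 is not divisible by 4.
print(comb(4, 2) % 4)
```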


THEOREM 76. If p is prime, then all the coefficients in $(1-x)^{-p}$ are 
divisible by p, except those of $1, x^p, x^{2p}, \dots$, which are congruent to 1 (mod p). 

By Theorem 73, the coefficients in 

    $(1-x)^{-p} = 1 + \sum_{n=1}^{\infty} \binom{p+n-1}{n} x^n$ 

are all integers. Since 

    $(1-x^p)^{-1} = 1 + x^p + x^{2p} + \dots$, 

we have to prove that every coefficient in the expansion of 

    $(1-x^p)^{-1} - (1-x)^{-p} = (1-x)^{-p}(1-x^p)^{-1}\{(1-x)^p - 1 + x^p\}$ 

is divisible by p. Since the coefficients in the expansions of $(1-x)^{-p}$ 
and $(1-x^p)^{-1}$ are integers it is enough to prove that every coefficient 
in the polynomial $(1-x)^p - 1 + x^p$ is divisible by p. For p = 2 this is 
trivial and, for $p \ge 3$, it follows from Theorem 75 since 

    $(1-x)^p - 1 + x^p = \sum_{r=1}^{p-1} (-1)^r \binom{p}{r} x^r$. 

We shall require this theorem in Ch. XIX. 
THEOREM 77. If p is prime, then 

    $(x + y + \dots + w)^p \equiv x^p + y^p + \dots + w^p \pmod{p}$. 

For 

    $(x + y)^p \equiv x^p + y^p \pmod{p}$, 

by Theorem 75, and the general result follows by repetition of the 
argument. 

Another useful corollary of Theorem 75 is 

THEOREM 78. If $\alpha > 0$ and 

    $m \equiv 1 \pmod{p^{\alpha}}$, 

then 

    $m^p \equiv 1 \pmod{p^{\alpha+1}}$. 

For $m = 1 + kp^{\alpha}$, where k is an integer, and $\alpha p \ge \alpha + 1$. Hence 

    $m^p = (1 + kp^{\alpha})^p = 1 + lp^{\alpha+1}$, 

where l is an integer. 


6.3. A second proof of Theorem 72. We can now give Euler’s 
proof of Theorem 72. Suppose that $m = \prod p^{\alpha}$. Then it is enough, 
after Theorem 53, to prove that 

    $a^{\phi(m)} \equiv 1 \pmod{p^{\alpha}}$. 

But 

    $\phi(m) = \prod \phi(p^{\alpha}) = \prod p^{\alpha-1}(p-1)$, 

and so it is sufficient to prove that 

    $a^{p^{\alpha-1}(p-1)} \equiv 1 \pmod{p^{\alpha}}$ 

when $p \nmid a$. 

By Theorem 77, 

    $(x + y + \dots)^p \equiv x^p + y^p + \dots \pmod{p}$. 

Taking x = y = z = ... = 1, and supposing that there are a numbers, 
we obtain 

    $a^p \equiv a \pmod{p}$, 

or 

    $a^{p-1} \equiv 1 \pmod{p}$. 

Hence, by Theorem 78, 

    $a^{p(p-1)} \equiv 1 \pmod{p^2}$,  $a^{p^2(p-1)} \equiv 1 \pmod{p^3}$,  ..., 
    $a^{p^{\alpha-1}(p-1)} \equiv 1 \pmod{p^{\alpha}}$. 
6.4. Proof of Theorem 22. Before proceeding to the more impor- 
tant applications of Fermat’s theorem, we use it to prove Theorem 22 


of Ch. II. 
We can write f(n) in the form
$$f(n) = \sum_{r=1}^{m} Q_r(n)\,a_r^n = \sum_{r=1}^{m} \Bigl(\sum_{s=0}^{q_r} c_{r,s} n^s\Bigr) a_r^n,$$
where the a and c are integers and
$$1 \le a_1 < a_2 < \dots < a_m.$$
The terms of f(n) are thus arranged in increasing order of magnitude
for large n, and f(n) is dominated by its last term
$$c_{m,q_m} n^{q_m} a_m^n$$
for large n (so that the last c is positive).
If f(n) is prime for all large n, then there is an n for which
$$f(n) = p > a_m$$
and p is prime. Then
$$\{n+kp(p-1)\}^s \equiv n^s \pmod{p},$$
for all integral k and s. Also, by Fermat’s theorem,
$$a_r^{p-1} \equiv 1 \pmod{p}$$
and so
$$a_r^{n+kp(p-1)} \equiv a_r^n \pmod{p}$$
for all positive integral k. Hence
$$\{n+kp(p-1)\}^s a_r^{n+kp(p-1)} \equiv n^s a_r^n \pmod{p}$$
and therefore
$$f\{n+kp(p-1)\} \equiv f(n) \equiv 0 \pmod{p}$$
for all positive integral k; a contradiction.
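The mechanism of this proof can be seen on a small example. In the sketch below the function f(n) = n.2^n + 3^n is a hypothetical instance chosen by us; it takes the prime value 5 at n = 1, and 5 exceeds the largest base 3.

```python
def f(n):
    # Hypothetical example of the form treated in Theorem 22: n * 2^n + 3^n.
    return n * 2 ** n + 3 ** n

def shifted_values_mod_p(func, n, p, count):
    """Residues (mod p) of func(n + k*p*(p-1)) for k = 1, ..., count."""
    return [func(n + k * p * (p - 1)) % p for k in range(1, count + 1)]

p = f(1)                                   # f(1) = 5, a prime greater than 3
residues = shifted_values_mod_p(f, 1, p, 4)
```

All the shifted values are multiples of p and larger than p, so none of them can be prime.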


6.5. Quadratic residues. Let us suppose that p is an odd prime,
that $p \nmid a$, and that x is one of the numbers
$$1, 2, 3, \dots, p-1.$$
Then, by Theorem 58, just one of the numbers
$$1 \cdot x, \; 2 \cdot x, \; \dots, \; (p-1)x$$
is congruent to a (mod p). There is therefore a unique $x'$ such that
$$xx' \equiv a \pmod{p}, \quad 0 < x' < p.$$
We call $x'$ the associate of x. There are then two possibilities: either
there is at least one x associated with itself, so that $x' = x$, or there is
no such x.

(1) Suppose that the first alternative is the true one and that $x_1$ is
associated with itself. In this case the congruence
$$x^2 \equiv a \pmod{p}$$
has the solution $x = x_1$; and we say that a is a quadratic residue of p,
or (when there is no danger of a misunderstanding) simply a residue
of p, and write a R p. Plainly
$$x = p-x_1 \equiv -x_1 \pmod{p}$$
is another solution of the congruence. Also, if $x' = x$ for any other
value $x_2$ of x, we have
$$x_1^2 \equiv a, \quad x_2^2 \equiv a, \quad (x_1-x_2)(x_1+x_2) = x_1^2-x_2^2 \equiv 0 \pmod{p}.$$
Hence either $x_2 \equiv x_1$ or
$$x_2 \equiv -x_1 \equiv p-x_1;$$
and there are just two solutions of the congruence, namely $x_1$ and $p-x_1$.

In this case the numbers
$$1, 2, \dots, p-1$$
may be grouped as $x_1$, $p-x_1$, and $\tfrac12(p-3)$ pairs of unequal associated
numbers. Now
$$x_1(p-x_1) \equiv -x_1^2 \equiv -a \pmod{p},$$
while
$$xx' \equiv a \pmod{p}$$
for any associated pair x, $x'$. Hence
$$(p-1)! = \prod x \equiv -a \cdot a^{\frac12(p-3)} \equiv -a^{\frac12(p-1)} \pmod{p}.$$

(2) If the second alternative is true and no x is associated with itself,
we say that a is a quadratic non-residue of p, or simply a non-residue
of p, and write a N p. In this case the congruence
$$x^2 \equiv a \pmod{p}$$
has no solution, and the numbers
$$1, 2, \dots, p-1$$
may be arranged in $\tfrac12(p-1)$ associated unequal pairs. Hence
$$(p-1)! = \prod x \equiv a^{\frac12(p-1)} \pmod{p}.$$
We define ‘Legendre’s symbol’ $\left(\frac{a}{p}\right)$, where p is an odd prime and a is
any number not divisible by p, by
$$\left(\frac{a}{p}\right) = +1, \text{ if } a \, R \, p,$$
$$\left(\frac{a}{p}\right) = -1, \text{ if } a \, N \, p.$$
It is plain that
$$\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$$
if $a \equiv b \pmod{p}$. We have then proved
THEOREM 79. If p is an odd prime and a is not a multiple of p, then
$$(p-1)! \equiv -\left(\frac{a}{p}\right) a^{\frac12(p-1)} \pmod{p}.$$

We have supposed p odd. It is plain that $0 = 0^2$, $1 = 1^2$, and so all numbers,
are quadratic residues of 2. We do not define Legendre’s symbol when p = 2,
and we ignore this case in what follows. Some of our theorems are true (but 
trivial) when p = 2. 
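The pairing argument of § 6.5 can be imitated directly. This is a sketch under our own naming (`associate`, `legendre_by_pairing`, `theorem_79_holds` are illustrative, not notation from the text), and it assumes throughout that p is an odd prime not dividing a.

```python
from math import factorial

def associate(x, a, p):
    """The unique x' with 0 < x' < p and x * x' ≡ a (mod p), found by search."""
    return next(y for y in range(1, p) if (x * y - a) % p == 0)

def legendre_by_pairing(a, p):
    """+1 if some x is its own associate (a R p), otherwise -1 (a N p)."""
    return 1 if any(associate(x, a, p) == x for x in range(1, p)) else -1

def theorem_79_holds(a, p):
    """Check (p-1)! ≡ -(a/p) * a^((p-1)/2) (mod p)."""
    lhs = factorial(p - 1) % p
    rhs = (-legendre_by_pairing(a, p) * pow(a, (p - 1) // 2, p)) % p
    return lhs == rhs
```

For example, 2 is its own associate's square modulo 7 (3.3 = 9 ≡ 2), while no x is self-associated for a = 3.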


6.6. Special cases of Theorem 79: Wilson’s theorem. The 
two simplest cases are those in which a = 1 and a = — 1. 
(1) First let a = 1. Then
$$x^2 \equiv 1 \pmod{p}$$
has the solutions $x = \pm 1$; hence 1 is a quadratic residue of p and
$$\left(\frac{1}{p}\right) = 1.$$
If we put a = 1 in Theorem 79, it becomes
THEOREM 80 (WILSON’S THEOREM):
$$(p-1)! \equiv -1 \pmod{p}.$$
Thus $11 \mid 3628801$.
The congruence
$$(p-1)!+1 \equiv 0 \pmod{p^2}$$
is true for
$$p = 5, \quad p = 13, \quad p = 563,$$
but for no other value of p less than 200000. Apparently no general theorem
concerning the congruence is known.


If m is composite, then
$$m \mid (m-1)!+1$$
is false, for there is a number d such that
$$d \mid m, \quad 1 < d < m,$$
and d does not divide (m-1)!+1. Hence we derive
THEOREM 81. If m > 1, then a necessary and sufficient condition that
m should be prime is that
$$m \mid (m-1)!+1.$$
The theorem is of course quite useless as a practical test for the 
primality of a given number m. 
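Although useless in practice, the criterion is easy to state in code. The sketch below (function names are ours) applies Theorem 81 to small m and also searches a short range for the primes satisfying the stronger congruence to modulus p^2 mentioned above.

```python
from math import factorial

def is_prime_by_wilson(m):
    """Theorem 81: m > 1 is prime exactly when m divides (m-1)! + 1."""
    return m > 1 and (factorial(m - 1) + 1) % m == 0

def satisfies_mod_p_squared(p):
    """The stronger congruence (p-1)! + 1 ≡ 0 (mod p^2)."""
    return (factorial(p - 1) + 1) % (p * p) == 0

small_primes = [m for m in range(2, 40) if is_prime_by_wilson(m)]
strong_cases = [p for p in range(2, 600)
                if is_prime_by_wilson(p) and satisfies_mod_p_squared(p)]
```

The cost of computing (m-1)! grows so quickly with m that the test is impractical for numbers of any size.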
(2) Next suppose a = -1. Then Theorems 79 and 80 show that
$$\left(\frac{-1}{p}\right) \equiv -(-1)^{\frac12(p-1)}(p-1)! \equiv (-1)^{\frac12(p-1)} \pmod{p}.$$

THEOREM 82. The number -1 is a quadratic residue of primes of the
form 4k+1 and a non-residue of primes of the form 4k+3, i.e.
$$\left(\frac{-1}{p}\right) = (-1)^{\frac12(p-1)}.$$
More generally, combination of Theorems 79 and 80 gives
THEOREM 83:
$$\left(\frac{a}{p}\right) \equiv a^{\frac12(p-1)} \pmod{p}.$$
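Theorem 83 gives a direct way of computing Legendre's symbol. A minimal sketch, assuming p is an odd prime and p does not divide a (the function name `legendre` is ours):

```python
def legendre(a, p):
    """Legendre's symbol (a/p) by Theorem 83; assumes p odd prime, p ∤ a."""
    r = pow(a % p, (p - 1) // 2, p)   # r is 1 or p - 1 under the assumptions
    return 1 if r == 1 else -1

def minus_one_is_residue(p):
    """Theorem 82: -1 is a residue exactly when p is of the form 4k + 1."""
    return legendre(-1, p) == 1
```

For instance `legendre(2, 7)` is 1, since 3^2 = 9 ≡ 2 (mod 7).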


6.7. Elementary properties of quadratic residues and non- 
residues. The numbers 
(6.7.1) $1^2, 2^2, 3^2, \dots, \{\tfrac12(p-1)\}^2$
are all incongruent; for $r^2 \equiv s^2$ implies $r \equiv s$ or $r \equiv -s \pmod{p}$, and
the second alternative is impossible here. Also

$$r^2 \equiv (p-r)^2 \pmod{p}.$$

It follows that there are $\tfrac12(p-1)$ residues and $\tfrac12(p-1)$ non-residues of p.

THEOREM 84. There are $\tfrac12(p-1)$ residues and $\tfrac12(p-1)$ non-residues of
an odd prime p.

We next prove

THEOREM 85. The product of two residues, or of two non-residues, is
a residue, while the product of a residue and a non-residue is a non-residue.

(1) Let us write $\alpha, \alpha', \alpha_1, \dots$ for residues and $\beta, \beta', \beta_1, \dots$ for
non-residues. Then every $\alpha\alpha'$ is an $\alpha$, since
$$x^2 \equiv \alpha \, . \, y^2 \equiv \alpha' \rightarrow (xy)^2 \equiv \alpha\alpha' \pmod{p}.$$
(2) If $\alpha_1$ is a fixed residue, then
$$1 \cdot \alpha_1, \; 2 \cdot \alpha_1, \; 3 \cdot \alpha_1, \; \dots, \; (p-1)\alpha_1$$
is a complete system (mod p). Since every $\alpha\alpha_1$ is a residue, every $\beta\alpha_1$
must be a non-residue.
(3) Similarly, if $\beta_1$ is a fixed non-residue, every $\beta\beta_1$ is a residue. For
$$1 \cdot \beta_1, \; 2 \cdot \beta_1, \; \dots, \; (p-1)\beta_1$$
is a complete system (mod p), and every $\alpha\beta_1$ is a non-residue, so that
every $\beta\beta_1$ is a residue.

Theorem 85 is also a corollary of Theorem 83. 
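Theorems 84 and 85 are easily confirmed for a given small prime. In this sketch the residues are taken from the list (6.7.1); the helper names are ours.

```python
def quadratic_residues(p):
    """The residues of the odd prime p, read off from the list (6.7.1)."""
    return {(r * r) % p for r in range(1, (p - 1) // 2 + 1)}

def product_rule_holds(p):
    """Theorem 85: a*b is a residue exactly when a and b are of the same kind."""
    residues = quadratic_residues(p)
    for a in range(1, p):
        for b in range(1, p):
            same_kind = (a in residues) == (b in residues)
            if ((a * b) % p in residues) != same_kind:
                return False
    return True
```

With p = 11 the residues are 1, 3, 4, 5, 9, which is half of the numbers 1, ..., 10.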

We add two theorems which we shall use in Ch. XX. The first is 
little but a restatement of part of Theorem 82. 


THEOREM 86. If p is a prime 4k+1, then there is an x such that
$$1+x^2 = mp,$$
where 0 < m < p.

For, by Theorem 82, -1 is a residue of p, and so congruent to one
of the numbers (6.7.1), say $x^2$; and
$$0 < 1+x^2 < 1+(\tfrac12 p)^2 < p^2.$$
THEOREM 87. If p is an odd prime, then there are numbers x and y such
that
$$1+x^2+y^2 = mp,$$
where 0 < m < p.

The $\tfrac12(p+1)$ numbers
(6.7.2) $x^2 \quad (0 \le x \le \tfrac12(p-1))$
are incongruent, and so are the $\tfrac12(p+1)$ numbers
(6.7.3) $-1-y^2 \quad (0 \le y \le \tfrac12(p-1)).$

But there are p+1 numbers in the two sets together, and only p residues
(mod p); and therefore some number (6.7.2) must be congruent to some
number (6.7.3). Hence there are an x and a y, each numerically less
than $\tfrac12 p$, such that
$$x^2 \equiv -1-y^2, \quad 1+x^2+y^2 = mp.$$
Also
$$0 < 1+x^2+y^2 < 1+2(\tfrac12 p)^2 < p^2,$$
so that 0 < m < p.

Theorem 86 shows that we may take y = 0 when p = 4k+ 1. 
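The counting argument for Theorem 87 translates into a simple search over the two sets (6.7.2) and (6.7.3). A sketch, with a function name of our own choosing:

```python
def one_plus_two_squares(p):
    """Find x, y, m with 1 + x^2 + y^2 = m*p and 0 <= x, y <= (p-1)/2."""
    half = (p - 1) // 2
    # Map each residue of the set (6.7.2) to the x that produces it.
    squares = {(x * x) % p: x for x in range(half + 1)}
    for y in range(half + 1):
        x = squares.get((-1 - y * y) % p)   # match against the set (6.7.3)
        if x is not None:
            return x, y, (1 + x * x + y * y) // p
    return None   # cannot happen for an odd prime p, by Theorem 87
```

For p = 7 this returns x = 3, y = 2, m = 2, and for p = 13 it returns a solution with y = 0, as Theorem 86 predicts.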
6.8. The order of a (mod m). We know, by Theorem 72, that
$$a^{\phi(m)} \equiv 1 \pmod{m}$$
if (a, m) = 1. We denote by d the smallest positive value of x for which
(6.8.1) $a^x \equiv 1 \pmod{m},$
so that $d \le \phi(m)$.
We call the congruence (6.8.1) the proposition P(x). Then it is
obvious that P(x) and P(y) imply P(x+y). Also, if $y \le x$ and
$$a^{x-y} \equiv b \pmod{m},$$
then
$$a^x \equiv ba^y \pmod{m},$$
so that P(x) and P(y) imply P(x-y). Hence P(x) satisfies the conditions
of Theorem 69, and
$$d \mid \phi(m).$$
We call d the order† of a (mod m), and say that a belongs to d (mod m).
Thus
$$2 \equiv 2, \quad 2^2 \equiv 4, \quad 2^3 \equiv 1 \pmod{7},$$
and so 2 belongs to 3 (mod 7). If $d = \phi(m)$, we say that a is a primitive
root of m. Thus 2 is a primitive root of 5, since
$$2 \equiv 2, \quad 2^2 \equiv 4, \quad 2^3 \equiv 3, \quad 2^4 \equiv 1 \pmod{5};$$
and 3 is a primitive root of 17. The notion of a primitive root of m
bears some analogy to the algebraical notion, explained in § 5.6, of a
primitive root of unity. We shall prove in § 7.5 that there are primitive
roots of every odd prime p.

We can sum up what we have proved in the form 

THEOREM 88. Any number a prime to m belongs (mod m) to a divisor
of $\phi(m)$: if d is the order of a (mod m), then $d \mid \phi(m)$. If m is a prime p,
then $d \mid (p-1)$. The congruence $a^x \equiv 1 \pmod{m}$ is true or false according
as x is or is not a multiple of d.
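The order can be found by multiplying repeatedly by a until the residue 1 appears. A minimal sketch, assuming m > 1 and (a, m) = 1; the function name is ours.

```python
from math import gcd

def order(a, m):
    """The least d > 0 with a^d ≡ 1 (mod m); requires m > 1 and (a, m) = 1."""
    if m < 2 or gcd(a, m) != 1:
        raise ValueError("order is defined only for m > 1 and (a, m) = 1")
    d, power = 1, a % m
    while power != 1:
        power = power * a % m
        d += 1
    return d
```

This reproduces the examples of the text: 2 belongs to 3 (mod 7), and 2 and 3 are primitive roots of 5 and 17 respectively.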


6.9. The converse of Fermat’s theorem. The direct converse of
Fermat’s theorem is false; it is not true that, if $m \nmid a$ and
(6.9.1) $a^{m-1} \equiv 1 \pmod{m},$
then m is necessarily a prime. It is not even true that, if (6.9.1) is true
for all a prime to m, then m is prime. Suppose, for example, that
m = 561 = 3.11.17. If $3 \nmid a$, $11 \nmid a$, $17 \nmid a$, we have
$$a^2 \equiv 1 \pmod{3}, \quad a^{10} \equiv 1 \pmod{11}, \quad a^{16} \equiv 1 \pmod{17}$$
by Theorem 71. But $2 \mid 560$, $10 \mid 560$, $16 \mid 560$ and so $a^{560} \equiv 1$ to each
of the moduli 3, 11, 17 and so to the modulus 3.11.17 = 561.

† Often called the index; but this word has a quite different meaning in the theory
of groups.
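The example m = 561 can be confirmed by brute force. In this sketch the function name is ours; 341 is included only as a contrasting case in which (6.9.1) holds for a = 2 but not for every a prime to m.

```python
from math import gcd

def fermat_holds_for_all_bases(m):
    """True if a^(m-1) ≡ 1 (mod m) for every a in 1..m-1 prime to m."""
    return all(pow(a, m - 1, m) == 1 for a in range(1, m) if gcd(a, m) == 1)

carmichael_561 = fermat_holds_for_all_bases(561)   # 561 = 3 * 11 * 17
base_2_only_341 = pow(2, 340, 341) == 1 and not fermat_holds_for_all_bases(341)
```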
For particular a we can prove a little more, viz. 


THEOREM 89. For every a > 1, there is an infinity of composite m
satisfying (6.9.1).

Let p be any odd prime which does not divide $a(a^2-1)$. We take
(6.9.2) $m = \dfrac{a^{2p}-1}{a^2-1} = \left(\dfrac{a^p-1}{a-1}\right)\left(\dfrac{a^p+1}{a+1}\right),$
so that m is clearly composite. Now
$$(a^2-1)(m-1) = a^{2p}-a^2 = a(a^{p-1}-1)(a^p+a).$$
Since a and $a^p$ are both odd or both even, $2 \mid (a^p+a)$. Again $a^{p-1}-1$
is divisible by p (after Theorem 71) and by $a^2-1$, since p-1 is even.
Since $p \nmid (a^2-1)$, this means that $p(a^2-1) \mid (a^{p-1}-1)$. Hence
$$2p(a^2-1) \mid (a^2-1)(m-1),$$
so that $2p \mid (m-1)$ and m = 1+2pu for some integral u. Now, to
modulus m,
$$a^{2p} = 1+m(a^2-1) \equiv 1, \quad a^{m-1} = a^{2pu} \equiv 1$$
and this is (6.9.1). Since we have a different value of m for every odd
p which does not divide $a(a^2-1)$, the theorem is proved.
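The construction (6.9.2) is explicit enough to compute. A sketch, with function names of our own; it assumes that p is an odd prime not dividing a(a^2 - 1).

```python
def cipolla_factors(a, p):
    """The two factors (a^p - 1)/(a - 1) and (a^p + 1)/(a + 1) of (6.9.2)."""
    return (a ** p - 1) // (a - 1), (a ** p + 1) // (a + 1)

def cipolla_pseudoprime(a, p):
    """m = (a^(2p) - 1)/(a^2 - 1), composite and satisfying (6.9.1)."""
    return (a ** (2 * p) - 1) // (a * a - 1)

m = cipolla_pseudoprime(2, 5)            # the smallest case for a = 2
satisfies_6_9_1 = pow(2, m - 1, m) == 1
```

With a = 2 and p = 5 the construction gives m = 341 = 31.11, the smallest composite m for which 2^(m-1) ≡ 1 (mod m).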
A correct converse of Theorem 71 is

THEOREM 90. If $a^{m-1} \equiv 1 \pmod{m}$ and $a^x \not\equiv 1 \pmod{m}$ for any
divisor x of m-1 less than m-1, then m is prime.

Clearly (a, m) = 1. If d is the order of a (mod m), then $d \mid (m-1)$ and
$d \mid \phi(m)$ by Theorem 88. Since $a^d \equiv 1$, we must have d = m-1 and
so $(m-1) \mid \phi(m)$. But
$$\phi(m) = m \prod_{p \mid m} \left(1-\frac{1}{p}\right) < m-1$$
if m is composite, and therefore m must be prime.
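Theorem 90 can be applied literally for small m by trying every proper divisor of m - 1. This is only a sketch with a name of our own; it is not one of the practicable modifications of the test mentioned in the Notes.

```python
def lucas_certifies_prime(a, m):
    """Theorem 90: a^(m-1) ≡ 1 and a^x ≢ 1 for each divisor x < m-1 of m-1."""
    if m < 3 or pow(a, m - 1, m) != 1:
        return False
    proper_divisors = [x for x in range(1, m - 1) if (m - 1) % x == 0]
    return all(pow(a, x, m) != 1 for x in proper_divisors)
```

A base a succeeds exactly when it is a primitive root of the prime m, so `lucas_certifies_prime(3, 17)` holds while the base 2 fails for m = 17, since 2 belongs to 8 (mod 17).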
6.10. Divisibility of $2^{p-1}-1$ by $p^2$. By Fermat’s theorem
$$2^{p-1}-1 \equiv 0 \pmod{p}$$
if p > 2. Is it ever true that
$$2^{p-1}-1 \equiv 0 \pmod{p^2}?$$
This question is of importance in the theory of ‘Fermat’s last theorem’
(see Ch. XIII). The phenomenon does occur, but very rarely.
THEOREM 91. There is a prime p for which
$$2^{p-1}-1 \equiv 0 \pmod{p^2}.$$

In fact this is true when p = 1093, as can be shown by straight- 
forward calculation. We give a shorter proof, in which all congruences 
are to modulus p? = 1194649. 

In the first place, since

$$3^7 = 2187 = 2p+1,$$
we have
(6.10.1) $3^{14} \equiv 4p+1.$

Next
$$2^{14} = 16384 = 15p-11, \quad 2^{28} \equiv -330p+121,$$
$$3^2 \cdot 2^{28} \equiv -2970p+1089 = -2969p-4 \equiv -1876p-4,$$
and so
$$3^2 \cdot 2^{26} \equiv -469p-1.$$
Hence, by the binomial theorem,
$$3^{14} \cdot 2^{182} \equiv -(469p+1)^7 \equiv -3283p-1,$$
and so
(6.10.2) $3^{14} \cdot 2^{182} \equiv -4p-1.$
From (6.10.1) and (6.10.2) it follows that
$$3^{14} \cdot 2^{182} \equiv -3^{14}, \quad 2^{182} \equiv -1$$
and so
$$2^{1092} \equiv 1 \pmod{1093^2}.$$
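The steps of this calculation, and the rarity of the phenomenon, can both be checked by machine. The sketch below uses trial division for primality and an upper bound of 4000 chosen by us.

```python
def is_prime(n):
    """Primality by trial division; adequate for the small range used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

p = 1093
modulus = p * p                                       # 1194649
step_6_10_1 = pow(3, 14, modulus) == 4 * p + 1        # (6.10.1)
step_minus_one = pow(2, 182, modulus) == modulus - 1  # 2^182 ≡ -1
cases = [q for q in range(3, 4000)
         if is_prime(q) and pow(2, q - 1, q * q) == 1]
```

The search returns 1093 and 3511, in agreement with the note on § 6.10.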


6.11. Gauss’s lemma and the quadratic character of 2. If p
is an odd prime, there is just one residue† of n (mod p) between $-\tfrac12 p$
and $\tfrac12 p$. We call this residue the minimal residue of n (mod p); it is
positive or negative according as the least non-negative residue of n lies
between 0 and $\tfrac12 p$ or between $\tfrac12 p$ and p.

We now suppose that m is an integer, positive or negative, not
divisible by p, and consider the minimal residues of the $\tfrac12(p-1)$ numbers
(6.11.1) $m, 2m, 3m, \dots, \tfrac12(p-1)m.$

We can write these residues in the form
$$r_1, r_2, \dots, r_{\lambda}, \; -r'_1, -r'_2, \dots, -r'_{\mu},$$
where
$$\lambda+\mu = \tfrac12(p-1), \quad 0 < r_i < \tfrac12 p, \quad 0 < r'_j < \tfrac12 p.$$

† Here, of course, ‘residue’ has its usual meaning and is not an abbreviation of
‘quadratic residue’.

Since the numbers (6.11.1) are incongruent, no two r can be equal, and
no two $r'$. If an r and an $r'$ are equal, say $r_i = r'_j$, let am, bm be the
two of the numbers (6.11.1) such that
$$am \equiv r_i, \quad bm \equiv -r'_j \pmod{p}.$$
Then
$$am+bm \equiv 0 \pmod{p},$$
and so
$$a+b \equiv 0 \pmod{p},$$
which is impossible because $0 < a < \tfrac12 p$, $0 < b < \tfrac12 p$.
It follows that the numbers $r_i$, $r'_j$ are a rearrangement of the numbers
$$1, 2, \dots, \tfrac12(p-1);$$
and therefore that
$$m \cdot 2m \dots \tfrac12(p-1)m \equiv (-1)^{\mu} \, 1 \cdot 2 \dots \tfrac12(p-1) \pmod{p}$$
and so
$$m^{\frac12(p-1)} \equiv (-1)^{\mu} \pmod{p}.$$
But
$$\left(\frac{m}{p}\right) \equiv m^{\frac12(p-1)} \pmod{p},$$
by Theorem 83. Hence we obtain
THEOREM 92 (GAUSS’S LEMMA): $\left(\frac{m}{p}\right) = (-1)^{\mu}$, where $\mu$ is the number
of members of the set
$$m, 2m, 3m, \dots, \tfrac12(p-1)m,$$
whose least positive residues (mod p) are greater than $\tfrac12 p$.
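Gauss's lemma gives a second way of computing Legendre's symbol, by counting. The following sketch (names ours) compares it with the value given by Theorem 83, assuming p is an odd prime not dividing m.

```python
def gauss_mu(m, p):
    """Number of k*m, 1 <= k <= (p-1)/2, whose least positive residue exceeds p/2."""
    return sum(1 for k in range(1, (p - 1) // 2 + 1) if 2 * ((k * m) % p) > p)

def legendre_by_gauss(m, p):
    """(m/p) = (-1)^mu, as in Theorem 92."""
    return (-1) ** gauss_mu(m, p)

def legendre_by_euler(m, p):
    """(m/p) from Theorem 83, for comparison."""
    return 1 if pow(m % p, (p - 1) // 2, p) == 1 else -1
```

Python's `%` operator returns the least non-negative residue even for negative m, so the same function serves for m = -3.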
Let us take in particular m = 2, so that the numbers (6.11.1) are
$$2, 4, \dots, p-1.$$
In this case $\lambda$ is the number of positive even integers less than $\tfrac12 p$.

We introduce here a notation which we shall use frequently later.
We write [x] for the ‘integral part of x’, the largest integer which does
not exceed x. Thus
$$x = [x]+f,$$
where $0 \le f < 1$. For example,
$$[\tfrac52] = 2, \quad [\tfrac12] = 0, \quad [-\tfrac32] = -2.$$
With this notation
$$\lambda = [\tfrac14 p].$$
But
$$\lambda+\mu = \tfrac12(p-1),$$
and so
$$\mu = \tfrac12(p-1)-[\tfrac14 p].$$
If $p \equiv 1 \pmod{4}$, then
$$\mu = \tfrac12(p-1)-\tfrac14(p-1) = \tfrac14(p-1) = [\tfrac14(p+1)];$$
and if $p \equiv 3 \pmod{4}$, then
$$\mu = \tfrac12(p-1)-\tfrac14(p-3) = \tfrac14(p+1) = [\tfrac14(p+1)].$$
Hence
$$\left(\frac{2}{p}\right) \equiv 2^{\frac12(p-1)} \equiv (-1)^{[\frac14(p+1)]} \pmod{p},$$
that is to say
$$\left(\frac{2}{p}\right) = 1, \text{ if } p = 8n+1 \text{ or } 8n-1,$$
$$\left(\frac{2}{p}\right) = -1, \text{ if } p = 8n+3 \text{ or } 8n-3.$$
If $p = 8n \pm 1$, then $\tfrac18(p^2-1)$ is even, while if $p = 8n \pm 3$, it is odd.
Hence
$$(-1)^{[\frac14(p+1)]} = (-1)^{\frac18(p^2-1)}.$$

Summing up, we have the following theorems.
THEOREM 93: $\left(\frac{2}{p}\right) = (-1)^{[\frac14(p+1)]}.$

THEOREM 94: $\left(\frac{2}{p}\right) = (-1)^{\frac18(p^2-1)}.$

THEOREM 95. 2 is a quadratic residue of primes of the form $8n \pm 1$ and
a quadratic non-residue of primes of the form $8n \pm 3$.


Gauss’s lemma may be used to determine the primes of which any
given integer m is a quadratic residue. For example, let us take m = -3,
and suppose that p > 3. The numbers (6.11.1) are
$$-3a \quad (1 \le a < \tfrac12 p),$$
and $\mu$ is the number of these numbers whose least positive residues lie
between $\tfrac12 p$ and p. Now
$$-3a \equiv p-3a \pmod{p},$$
and p-3a lies between $\tfrac12 p$ and p if $1 \le a < \tfrac16 p$. If $\tfrac16 p < a < \tfrac13 p$, then
p-3a lies between 0 and $\tfrac12 p$. If $\tfrac13 p < a < \tfrac12 p$, then
$$-3a \equiv 2p-3a \pmod{p},$$
and 2p-3a lies between $\tfrac12 p$ and p. Hence the values of a which satisfy
the condition are
$$1, 2, \dots, [\tfrac16 p], \; [\tfrac13 p]+1, [\tfrac13 p]+2, \dots, [\tfrac12 p],$$
and
$$\mu = [\tfrac16 p]+[\tfrac12 p]-[\tfrac13 p].$$
If p = 6n+1 then $\mu = n+3n-2n$ is even, and if p = 6n+5 then
$$\mu = n+(3n+2)-(2n+1)$$
is odd.

THEOREM 96. -3 is a quadratic residue of primes of the form 6n+1
and a quadratic non-residue of primes of the form 6n+5.

A further example, which we leave for the moment† to the reader, is

THEOREM 97. 5 is a quadratic residue of primes of the form $10n \pm 1$ and
a quadratic non-residue of primes of the form $10n \pm 3$.
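The closed formulae for $\mu$ found above for m = 2 and m = -3 can be compared with a direct count, and the same count settles Theorem 97 for small primes. A sketch with our own function names, where `//` plays the part of the integral part [x]:

```python
def gauss_mu(m, p):
    """Direct count of the mu of Gauss's lemma for the odd prime p."""
    return sum(1 for k in range(1, (p - 1) // 2 + 1) if 2 * ((k * m) % p) > p)

def mu_for_2(p):
    """mu = (p-1)/2 - [p/4], the value found for m = 2."""
    return (p - 1) // 2 - p // 4

def mu_for_minus_3(p):
    """mu = [p/6] + [p/2] - [p/3], the value found for m = -3 (p > 3)."""
    return p // 6 + p // 2 - p // 3

def five_is_residue(p):
    """Decide the character of 5 by Gauss's lemma (p an odd prime other than 5)."""
    return gauss_mu(5, p) % 2 == 0
```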

6.12. The law of reciprocity. The most famous theorem in this 
field is Gauss’s ‘law of reciprocity’. 

THEOREM 98. If p and q are odd primes, then
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{p'q'},$$
where
$$p' = \tfrac12(p-1), \quad q' = \tfrac12(q-1).$$
Since $p'q'$ is even if either p or q is of the form 4n+1, and odd if both
are of the form 4n+3, we can also state the theorem as
THEOREM 99. If p and q are odd primes, then
$$\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right),$$
unless both p and q are of the form 4n+3, in which case
$$\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right).$$
We require a lemma.

THEOREM 100.‡ If
$$S(q,p) = \sum_{s=1}^{p'} \left[\frac{sq}{p}\right],$$
then
$$S(q,p)+S(p,q) = p'q'.$$
The proof may be stated in a geometrical form. In the figure (Fig. 7)
AC and BC are x = p, y = q, and KM and LM are $x = p'$, $y = q'$.

† See § 6.13 for a proof depending on Gauss’s law of reciprocity.
‡ The notation has no connexion with that of § 5.6.

If (as in the figure) p > q, then $q'/p' < q/p$, and M falls below the
diagonal OC. Since
$$q' < \frac{qp'}{p} < q'+1,$$
there is no integer between $KM = q'$ and $KN = qp'/p$.

We count up, in two different ways, the number of lattice points in
the rectangle OKML, counting the points on KM and LM but not
those on the axes. In the first place, this number is plainly $p'q'$. But
there are no lattice points on OC (since p and q are prime), and none
in the triangle PMN except perhaps on PM. Hence the number of
lattice points in OKML is the sum of those in the triangles OKN and
OLP (counting those on KN and LP but not those on the axes).

The number on ST, the line x = s, is [sq/p], since sq/p is the ordinate
of T. Hence the number in OKN is
$$\sum_{s=1}^{p'} \left[\frac{sq}{p}\right] = S(q,p).$$
Similarly, the number in OLP is S(p, q), and the conclusion follows.


6.13. Proof of the law of reciprocity. We can write
(6.13.1) $kq = p\left[\dfrac{kq}{p}\right]+u_k,$
where
$$1 \le k \le p', \quad 1 \le u_k \le p-1.$$
Here $u_k$ is the least positive residue of kq (mod p). If $u_k = v_k \le p'$,
then $u_k$ is one of the minimal residues $r_i$ of § 6.11, while if $u_k = w_k > p'$
then $u_k-p$ is one of the minimal residues $-r'_j$. Thus
$$r_i = v_k, \quad r'_j = p-w_k$$
for every i, j, and some k.

The $r_i$ and $r'_j$ are (as we saw in § 6.11) the numbers 1, 2, ..., $p'$ in some
order. Hence, if
$$R = \sum r_i = \sum v_k, \quad R' = \sum r'_j = \sum (p-w_k) = \mu p-\sum w_k$$
(where $\mu$ is, as in § 6.11, the number of the $r'_j$), we have
$$R+R' = \sum_{\nu=1}^{p'} \nu = \frac12 \cdot \frac{p-1}{2} \cdot \frac{p+1}{2} = \frac{p^2-1}{8},$$
and so
(6.13.2) $\mu p+\sum v_k-\sum w_k = \tfrac18(p^2-1).$

On the other hand, summing (6.13.1) from k = 1 to $k = p'$, we have
(6.13.3) $\tfrac18 q(p^2-1) = pS(q,p)+\sum u_k = pS(q,p)+\sum v_k+\sum w_k.$
From (6.13.2) and (6.13.3) we deduce
(6.13.4) $\tfrac18(p^2-1)(q-1) = pS(q,p)+2\sum w_k-\mu p.$

Now q-1 is even, and $p^2-1 \equiv 0 \pmod{8}$;† so that the left-hand side
of (6.13.4) is even, and also the second term on the right. Hence (since
p is odd)
$$S(q,p) \equiv \mu \pmod{2},$$
and therefore, by Theorem 92,
$$\left(\frac{q}{p}\right) = (-1)^{\mu} = (-1)^{S(q,p)}.$$
Finally,
$$\left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^{S(q,p)+S(p,q)} = (-1)^{p'q'},$$
by Theorem 100.
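Each stage of the argument can be tested numerically: the lemma of Theorem 100, the relation between (q/p) and S(q, p), and the law itself. A sketch for distinct odd primes p and q, with names of our own choosing:

```python
def S(q, p):
    """S(q, p) = sum of [s*q/p] for s = 1, ..., (p-1)/2, as in Theorem 100."""
    return sum((s * q) // p for s in range(1, (p - 1) // 2 + 1))

def legendre(a, p):
    """Legendre's symbol by Theorem 83 (p an odd prime not dividing a)."""
    return 1 if pow(a % p, (p - 1) // 2, p) == 1 else -1

def reciprocity_holds(p, q):
    half_p, half_q = (p - 1) // 2, (q - 1) // 2
    lemma = S(q, p) + S(p, q) == half_p * half_q        # Theorem 100
    symbol = legendre(q, p) == (-1) ** S(q, p)          # the step of § 6.13
    law = legendre(p, q) * legendre(q, p) == (-1) ** (half_p * half_q)
    return lemma and symbol and law
```

For p = 5, q = 3 we have S(3, 5) = 1 and S(5, 3) = 1, whose sum is p'q' = 2.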
We now use the law of reciprocity to prove Theorem 97. If
$$p = 10n+k,$$
where k is 1, 3, 7, or 9, then (since 5 is of the form 4n+1)
$$\left(\frac{5}{p}\right) = \left(\frac{p}{5}\right) = \left(\frac{10n+k}{5}\right) = \left(\frac{k}{5}\right).$$
The residues of 5 are 1 and 4. Hence 5 is a residue of primes 5n+1
and 5n+4, i.e. of primes 10n+1 and 10n+9, and a non-residue of the
other odd primes.


6.14. Tests for primality. We now prove two theorems which 
provide tests for the primality of numbers of certain special forms. 
Both are closely related to Fermat’s Theorem. 

THEOREM 101. If p > 2, h < p, n = hp+1 or $hp^2+1$ and
(6.14.1) $2^h \not\equiv 1, \quad 2^{n-1} \equiv 1 \pmod{n},$
then n is prime.

We write $n = hp^b+1$, where b = 1 or 2, and suppose d to be the
order of 2 (mod n). After Theorem 88, it follows from (6.14.1) that $d \nmid h$
and $d \mid (n-1)$, i.e. $d \mid hp^b$. Hence $p \mid d$. But, by Theorem 88 again,
$d \mid \phi(n)$ and so $p \mid \phi(n)$. If
$$n = p_1^{a_1} \dots p_k^{a_k},$$
we have
$$\phi(n) = p_1^{a_1-1} \dots p_k^{a_k-1}(p_1-1) \dots (p_k-1)$$
and so, since $p \nmid n$, p divides at least one of $p_1-1, p_2-1, \dots, p_k-1$.
Hence n has a prime factor $P \equiv 1 \pmod{p}$.

† If p = 2n+1 then $p^2-1 = 4n(n+1) \equiv 0 \pmod{8}$.

Let n = Pm. Since $n \equiv 1 \equiv P \pmod{p}$, we have $m \equiv 1 \pmod{p}$.
If m > 1, then
(6.14.2) $n = (up+1)(vp+1), \quad 1 \le u \le v$
and
$$hp^{b-1} = uvp+u+v.$$
If b = 1, this is h = uvp+u+v and so
$$p \le uvp < h < p,$$
a contradiction. If b = 2,
$$hp = uvp+u+v, \quad p \mid (u+v), \quad u+v \ge p$$
and so
$$2v \ge u+v \ge p, \quad v \ge \tfrac12 p$$
and
$$uv < h < p, \quad uv \le p-2, \quad u \le \frac{p-2}{v} \le \frac{2(p-2)}{p} < 2.$$
Hence u = 1 and so
$$v \ge p-1, \quad uv \ge p-1,$$
a contradiction. Hence (6.14.2) is impossible and m = 1 and n = P.

THEOREM 102. Let $m \ge 2$, $h < 2^m$ and $n = h2^m+1$ be a quadratic
non-residue (mod p) for some odd prime p. Then the necessary and sufficient
condition for n to be a prime is that
(6.14.3) $p^{\frac12(n-1)} \equiv -1 \pmod{n}.$
First let us suppose n prime. Since $n \equiv 1 \pmod{4}$, we have
$$\left(\frac{p}{n}\right) = \left(\frac{n}{p}\right) = -1$$
by Theorem 99. Then (6.14.3) follows at once by Theorem 83. Hence
the condition is necessary.

Now let us suppose (6.14.3) true. Let P be any prime factor of n and
let d be the order of p (mod P). We have
$$p^{\frac12(n-1)} \equiv -1, \quad p^{n-1} \equiv 1, \quad p^{P-1} \equiv 1 \pmod{P}$$
and so, by Theorem 88,
$$d \nmid \tfrac12(n-1), \quad d \mid (n-1), \quad d \mid (P-1),$$
that is
$$d \nmid 2^{m-1}h, \quad d \mid 2^m h, \quad d \mid (P-1),$$
so that $2^m \mid d$ and $2^m \mid (P-1)$. Hence $P = 2^m x+1$.
Since $n \equiv 1 \equiv P \pmod{2^m}$, we have $n/P \equiv 1 \pmod{2^m}$ and so
$$n = (2^m x+1)(2^m y+1), \quad x \ge 1, \; y \ge 0.$$
Hence
$$2^m xy < 2^m xy+x+y = h < 2^m, \quad y = 0$$
and n = P. The condition is therefore sufficient.




If we put h = 1, $m = 2^k$, we have $n = F_k$ in the notation of § 2.4.
Since $1^2 \equiv 2^2 \equiv 1 \pmod{3}$ and $F_k \equiv 2 \pmod{3}$, $F_k$ is a non-residue
(mod 3). Hence a necessary and sufficient condition that $F_k$ be prime
is that $F_k \mid (3^{\frac12(F_k-1)}+1)$.
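Both the general criterion (6.14.3) and its special case for Fermat numbers are short to program. In this sketch the function names are ours, and `criterion_6_14_3` assumes that the hypotheses of Theorem 102 hold, in particular that n is a quadratic non-residue of the odd prime p.

```python
def criterion_6_14_3(h, m, p):
    """Test p^((n-1)/2) ≡ -1 (mod n) for n = h * 2^m + 1."""
    n = h * 2 ** m + 1
    return pow(p, (n - 1) // 2, n) == n - 1

def fermat_number_is_prime(k):
    """The case h = 1, m = 2^k, p = 3 of Theorem 102, valid for k >= 1."""
    F = 2 ** (2 ** k) + 1
    return pow(3, (F - 1) // 2, F) == F - 1

results = [fermat_number_is_prime(k) for k in range(1, 7)]
```

The results agree with the known facts that F_1, ..., F_4 are prime and F_5, F_6 composite. As a second illustration, n = 3.2^2 + 1 = 13 is a non-residue of 5, and `criterion_6_14_3(3, 2, 5)` holds.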


6.15. Factors of Mersenne numbers; a theorem of Euler. We
return for the moment to the problem of Mersenne’s numbers, mentioned
in § 2.5. There is one simple criterion, due to Euler, for the
factorability of $M_p = 2^p-1$.


THEOREM 103. If k > 1 and p = 4k+3 is prime, then a necessary
and sufficient condition that 2p+1 should be prime is that
(6.15.1) $2^p \equiv 1 \pmod{2p+1}.$
Thus, if 2p+1 is prime, $(2p+1) \mid M_p$ and $M_p$ is composite.

First let us suppose that 2p+1 = P is prime. By Theorem 95, since
$P \equiv 7 \pmod{8}$, 2 is a quadratic residue (mod P) and
$$2^p = 2^{\frac12(P-1)} \equiv 1 \pmod{P}$$
by Theorem 83. The condition (6.15.1) is therefore necessary and
$P \mid M_p$. But k > 1 and so p > 3 and $M_p = 2^p-1 > 2p+1 = P$.
Hence $M_p$ is composite.

Next, suppose that (6.15.1) is true. In Theorem 101, put h = 2,
n = 2p+1. Clearly h < p and $2^h = 4 \not\equiv 1 \pmod{n}$ and, by (6.15.1),
$$2^{n-1} = 2^{2p} \equiv 1 \pmod{n}.$$
Hence n is prime and the condition (6.15.1) is sufficient.

Theorem 103 contains the simplest criterion known for the character
of Mersenne numbers. The first eight cases in which this test gives
a factor of $M_p$ are

$$23 \mid M_{11}, \quad 47 \mid M_{23}, \quad 167 \mid M_{83}, \quad 263 \mid M_{131},$$
$$359 \mid M_{179}, \quad 383 \mid M_{191}, \quad 479 \mid M_{239}, \quad 503 \mid M_{251}.$$
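The list above can be regenerated from the criterion (6.15.1) alone. A sketch with our own function names, using trial division to pick out the primes p = 4k+3 with k > 1:

```python
def is_prime(n):
    """Trial division; adequate for the small primes p needed here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def euler_mersenne_factors(count):
    """First `count` pairs (2p+1, p) with p = 4k+3 prime, k > 1, satisfying (6.15.1)."""
    found = []
    p = 11                                   # the least prime 4k+3 with k > 1
    while len(found) < count:
        if p % 4 == 3 and is_prime(p) and pow(2, p, 2 * p + 1) == 1:
            found.append((2 * p + 1, p))
        p += 1
    return found

pairs = euler_mersenne_factors(8)
```

By Theorem 103 each 2p+1 found in this way is prime and divides M_p.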




NOTES ON CHAPTER VI 


§ 6.1. Fermat stated his theorem in 1640 (Œuvres, ii. 209). Euler’s first proof 
dates from 1736, and his generalization from 1760. See Dickson, History, i, ch. iii, 
for full information. 

§ 6.5. Legendre introduced ‘Legendre’s symbol’ in his Essai sur la théorie des
nombres, first published in 1798. See, for example, § 135 of the second edition
(1808).

§ 6.6. Wilson’s theorem was first published by Waring, Meditationes algebraicae
(1770), 288. There is evidence that it was known long before to Leibniz. Goldberg
(Journ. London Math. Soc. 28 (1953), 252-6) gives the residue of (p-1)!+1 to
modulus $p^2$ for p < 10000. See E. H. Pearson [Math. Computation 17 (1963),
194-5] for the statement about the congruence (mod $p^2$).

§ 6.9. Theorem 89 is due to Cipolla, Annali di Mat. (3), 9 (1903), 139-60.
Amongst others the following composite values of m satisfy (6.9.1) for all a
prime to m, viz. 3.11.17, 5.13.17, 5.17.29, 5.29.73, 7.13.19. Apart from these,
the composite values of m < 2000 for which $2^{m-1} \equiv 1 \pmod{m}$ are
341 = 11.31, 645 = 3.5.43, 1387 = 19.73, 1905 = 3.5.127.
See also Dickson, History, i. 91-95, and Lehmer, Amer. Math. Monthly, 43 (1936),
347-54. Lehmer gives a list of large composite m for which $2^{m-1} \equiv 1 \pmod{m}$.

Theorem 90 is due to Lucas, Amer. Journal of Math. 1 (1878), 302. It has 
been modified in various ways by D. H. Lehmer and others in order to obtain 
practicable tests for the prime or composite character of a given large m. See 
Lehmer, loc. cit., and Bulletin Amer. Math. Soc. 33 (1927), 327-40, and 34 (1928), 
54-56, and Duparc, Simon Stevin 29 (1952), 21-24.

§ 6.10. The proof is that of Landau, Vorlesungen, iii. 275, improved by R. F.
Whitehead. Theorem 91 is true also for p = 3511 (N. G. W. H. Beeger, Mess. Math.
51 (1922), 149-50) and for no other p < 200000 (see Pearson, loc. cit., above).

§§ 6.11-13. Theorem 95 was first proved by Euler, Theorem 98 was stated by 
Euler and Legendre, but the first satisfactory proofs were by Gauss. See Bachmann, 
Niedere Zahlentheorie, i, ch. 6, for the history of the subject, and many other proofs.

§ 6.14. Taking the known prime $2^{127}-1$ as p in Theorem 101, Miller and Wheeler
tested n = hp+1 and $n = hp^2+1$ (with various small values of h) for prime
factors < 400 and < 2000 respectively. For example, trivially, if h is odd, $2 \mid n$.
They then showed that $2^h \not\equiv 1 \pmod{n}$ for the remaining h (a fairly simple matter,
since $2^h-1$ is not large compared with n). Finally they used the Cambridge
electronic computer to test whether $2^{n-1} \equiv 1 \pmod{n}$. For each n = hp+1, this
took about 3 minutes, and for each $n = hp^2+1$ about 27 minutes. Several primes
of form hp+1 were found. Seven numbers of the form $hp^2+1$ were found not
to satisfy $2^{n-1} \equiv 1 \pmod{n}$ (and so to be composite) before $n = 180p^2+1$ was
found to satisfy the test. See Miller, Eureka, October 1951; Miller and Wheeler,
Nature, 168 (1951), 838; and our note to § 2.5. Theorem 101 is also true when
$n = hp^3+1$, provided that $h < \sqrt{p}$ and that h is not a cube. See Wright, Math.
Gazette, 37 (1953), 104-6.

Robinson extended Theorem 102 (Amer. Math. Monthly, 64 (1957), 703-10) and
he and Selfridge used the case p = 3 of the theorem to find a large number of
primes of the form $h \cdot 2^m+1$ (Math. tables and other aids to computation, 11 (1957),
21-22). Amongst these primes are several factors of Fermat numbers. See also
the note to § 15.5.

Lucas [Théorie des nombres, i (1891), p. xii] stated the test for the primality
of $F_k$ and Hurwitz [Math. Werke, ii. 747] gave a proof. $F_{10}$ was proved composite by
this test, though an actual factor was subsequently found (see Selfridge, Math.
tables and other aids to computation, 7 (1953), 274-5).

§ 6.15. Theorem 103; Euler, Comm. Acad. Petrop. 6 (1782-3), 103 [Opera (1), 
ii. 3]. 
VII
GENERAL PROPERTIES OF CONGRUENCES


7.1. Roots of congruences. An integer x which satisfies the congruence
$$f(x) = c_0x^n+c_1x^{n-1}+\dots+c_n \equiv 0 \pmod{m}$$
is said to be a root of the congruence or a root of f(x) (mod m). If a is
such a root, then so is any number congruent to a (mod m). Congruent
roots are considered equivalent; when we say that the congruence has
l roots, we mean that it has l incongruent roots.

An algebraic equation of degree n has (with appropriate conventions)
just n roots, and a polynomial of degree n is the product of n linear
factors. It is natural to inquire whether there are analogous theorems
for congruences, and the consideration of a few examples shows at once
that they cannot be so simple. Thus

(7.1.1) $x^{p-1}-1 \equiv 0 \pmod{p}$
has p-1 roots, viz.
$$1, 2, \dots, p-1,$$
by Theorem 71;

(7.1.2) $x^4-1 \equiv 0 \pmod{16}$
has 8 roots, viz. 1, 3, 5, 7, 9, 11, 13, 15; and
(7.1.3) $x^4-2 \equiv 0 \pmod{16}$

has no root. The possibilities are plainly much more complex than they
are for an algebraic equation.
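The three examples can be verified by trying every residue. In this sketch a polynomial is given by its list of coefficients $c_0, c_1, \dots, c_n$ (highest power first); the function name `roots` is ours.

```python
def roots(coeffs, m):
    """Incongruent roots (mod m) of c_0 x^n + ... + c_n, found by exhaustive search."""
    def value(x):
        acc = 0
        for c in coeffs:                 # Horner's rule, reduced mod m at each step
            acc = (acc * x + c) % m
        return acc
    return [x for x in range(m) if value(x) == 0]

roots_7_1_1 = roots([1, 0, 0, 0, 0, 0, -1], 7)    # x^6 - 1 (mod 7)
roots_7_1_2 = roots([1, 0, 0, 0, -1], 16)         # x^4 - 1 (mod 16)
roots_7_1_3 = roots([1, 0, 0, 0, -2], 16)         # x^4 - 2 (mod 16)
```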
7.2. Integral polynomials and identical congruences. If $c_0$,
$c_1$, ..., $c_n$ are integers then
$$c_0x^n+c_1x^{n-1}+\dots+c_n$$
is called an integral polynomial. If
$$f(x) = \sum_{r=0}^{n} c_r x^{n-r}, \quad g(x) = \sum_{r=0}^{n} c'_r x^{n-r},$$
and $c_r \equiv c'_r \pmod{m}$ for every r, then we say that f(x) and g(x) are
congruent to modulus m, and write
$$f(x) \equiv g(x) \pmod{m}.$$
Plainly
$$f(x) \equiv g(x) \rightarrow f(x)h(x) \equiv g(x)h(x)$$
if h(x) is any integral polynomial.
In what follows we shall use the symbol ‘≡’ in two different senses,
the sense of § 5.2, in which it expresses a relation between numbers,
and the sense just defined, in which it expresses a relation between
polynomials. There should be no confusion because, except in the
phrase ‘the congruence f(x) ≡ 0’, the variable x will occur only when
the symbol is used in the second sense. When we assert that f(x) ≡ g(x),
or f(x) ≡ 0, we are using it in this sense, and there is no reference to
any numerical value of x. But when we make an assertion about ‘the
roots of the congruence f(x) ≡ 0’, or discuss ‘the solution of the
congruence’,† it is naturally the first sense which we have in mind.

In the next section we introduce a similar double use of the symbol ‘|’.

THEOREM 104. (i) If p is prime and
$$f(x)g(x) \equiv 0 \pmod{p},$$
then either $f(x) \equiv 0$ or $g(x) \equiv 0 \pmod{p}$.
(ii) More generally, if
$$f(x)g(x) \equiv 0 \pmod{p^a}$$
and
$$f(x) \not\equiv 0 \pmod{p},$$
then
$$g(x) \equiv 0 \pmod{p^a}.$$

(i) We form $f_1(x)$ from f(x) by rejecting all terms of f(x) whose
coefficients are divisible by p, and $g_1(x)$ similarly. If $f(x) \not\equiv 0$ and
$g(x) \not\equiv 0$, then the first coefficients in $f_1(x)$ and $g_1(x)$ are not divisible by
p, and therefore the first coefficient in $f_1(x)g_1(x)$ is not divisible by p.
Hence
$$f(x)g(x) \equiv f_1(x)g_1(x) \not\equiv 0 \pmod{p}.$$

(ii) We may reject multiples of p from f(x), and multiples of $p^a$ from
g(x), and the result follows in the same way. This part of the theorem
will be required in Ch. VIII.
If $f(x) \equiv g(x)$, then $f(a) \equiv g(a)$ for all values of a. The converse is not
true; thus
$$a^p \equiv a \pmod{p}$$
for all a, by Theorem 70, but
$$x^p \equiv x \pmod{p}$$
is false.
7.3. Divisibility of polynomials (mod m). We say that f(x) is
divisible by g(x) to modulus m if there is an integral polynomial h(x)
such that
$$f(x) \equiv g(x)h(x) \pmod{m}.$$
We then write
$$g(x) \mid f(x) \pmod{m}.$$
THEOREM 105. A necessary and sufficient condition that
$$(x-a) \mid f(x) \pmod{m}$$
is that
$$f(a) \equiv 0 \pmod{m}.$$

† e.g. in § 8.2.

If
$$(x-a) \mid f(x) \pmod{m},$$
then
$$f(x) \equiv (x-a)h(x) \pmod{m}$$
for some integral polynomial h(x), and so
$$f(a) \equiv 0 \pmod{m}.$$
The condition is therefore necessary.

It is also sufficient. If
$$f(a) \equiv 0 \pmod{m},$$
then
$$f(x) \equiv f(x)-f(a) \pmod{m}.$$
But
$$f(x) = \sum c_r x^{n-r}$$
and
$$f(x)-f(a) = (x-a)h(x),$$
where
$$h(x) = \frac{f(x)-f(a)}{x-a} = \sum c_r(x^{n-r-1}+x^{n-r-2}a+\dots+a^{n-r-1})$$
is an integral polynomial. The degree of h(x) is one less than that of
f(x).
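The quotient h(x) of this proof can be computed by the usual rule for division by a linear factor, every coefficient being reduced to modulus m. A sketch, with a function name of our own and coefficients listed from the highest power:

```python
def divide_by_linear(coeffs, a, m):
    """Return (h, r) with f(x) ≡ (x - a) h(x) + r (mod m), where r ≡ f(a)."""
    partial = []
    acc = 0
    for c in coeffs:                 # Horner's rule builds the coefficients of h
        acc = (acc * a + c) % m
        partial.append(acc)
    remainder = partial.pop()        # the last value is f(a) mod m
    return partial, remainder

h, r = divide_by_linear([1, 0, -1], 3, 8)    # f(x) = x^2 - 1, a = 3, m = 8
```

Here r = 0 and h(x) = x + 3, in agreement with $x^2-1 \equiv (x-3)(x+3) \pmod{8}$.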


7.4. Roots of congruences to a prime modulus. In what follows 
we suppose that the modulus m is prime; it is only in this case that there 
is a simple general theory. We write p for m. 


THEOREM 106. If p is prime and
$$f(x) \equiv g(x)h(x) \pmod{p},$$
then any root of f(x) (mod p) is a root either of g(x) or of h(x).

If a is any root of f(x) (mod p), then
$$f(a) \equiv 0 \pmod{p},$$
or
$$g(a)h(a) \equiv 0 \pmod{p}.$$
Hence $g(a) \equiv 0 \pmod{p}$ or $h(a) \equiv 0 \pmod{p}$, and so a is a root of g(x)
or of h(x) (mod p).

The condition that the modulus is prime is essential. Thus
$$x^2 \equiv x^2-4 \equiv (x-2)(x+2) \pmod{4},$$
and 4 is a root of $x^2 \equiv 0 \pmod{4}$ but not of $x-2 \equiv 0 \pmod{4}$ or of
$x+2 \equiv 0 \pmod{4}$.

THEOREM 107. If f(x) is of degree n, and has more than n roots (mod p),
then
$$f(x) \equiv 0 \pmod{p}.$$

The theorem is significant only when n < p. It is true for n = 1, by
Theorem 57; and we may therefore prove it by induction.
We assume then that the theorem is true for a polynomial of degree
less than n. If f(x) is of degree n, and $f(a) \equiv 0 \pmod{p}$, then
$$f(x) \equiv (x-a)g(x) \pmod{p},$$
by Theorem 105; and g(x) is at most of degree n-1. By Theorem 106,
any root of f(x) is either a or a root of g(x). If f(x) has more than n
roots, then g(x) must have more than n-1 roots, and so
$$g(x) \equiv 0 \pmod{p},$$
from which it follows that
$$f(x) \equiv 0 \pmod{p}.$$
The condition that the modulus is prime is again essential. Thus
$$x^4-1 \equiv 0 \pmod{16}$$
has 8 roots.
The argument proves also
THEOREM 108. If f(x) has its full number of roots
$$a_1, a_2, \dots, a_n \pmod{p},$$
then
$$f(x) \equiv c_0(x-a_1)(x-a_2)\dots(x-a_n) \pmod{p}.$$


7.5. Some applications of the general theorems. (1) Fermat’s
theorem shows that the binomial congruence
(7.5.1) $x^d \equiv 1 \pmod{p}$
has its full number of roots when d = p-1. We can now prove that
this is true when d is any divisor of p-1.

THEOREM 109. If p is prime and $d \mid p-1$, then the congruence (7.5.1)
has d roots.

We have
$$x^{p-1}-1 = (x^d-1)g(x),$$
where
$$g(x) = x^{p-1-d}+x^{p-1-2d}+\dots+x^d+1.$$

Now $x^{p-1}-1 \equiv 0$ has p-1 roots, and $g(x) \equiv 0$ has at most p-1-d.
It follows, by Theorem 106, that $x^d-1 \equiv 0$ has at least d roots, and
therefore exactly d.

Of the d roots of (7.5.1), some will belong to d in the sense of § 6.8, but
others (for example 1) to smaller divisors of p-1. The number belonging
to d is given by the next theorem.

THEOREM 110. Of the d roots of (7.5.1), $\phi(d)$ belong to d. In particular,
there are $\phi(p-1)$ primitive roots of p.

If $\psi(d)$ is the number of roots belonging to d, then
$$\sum_{d \mid p-1} \psi(d) = p-1,$$
since each of 1, 2, ..., p-1 belongs to some d; and also
$$\sum_{d \mid p-1} \phi(d) = p-1,$$
by Theorem 63. If we can show that $\psi(d) \le \phi(d)$, it will follow that
$\psi(d) = \phi(d)$ for each d.

If $\psi(d) > 0$, then one at any rate of 1, 2, ..., p-1, say f, belongs to d.
We consider the d numbers
$$f_h = f^h \quad (0 \le h \le d-1).$$
Each of these numbers is a root of (7.5.1), since $f^d \equiv 1$ implies $f^{hd} \equiv 1$.
They are incongruent (mod p), since $f^h \equiv f^{h'}$, where $h' < h < d$, would
imply $f^k \equiv 1$, where $0 < k = h-h' < d$, and then f would not belong
to d; and therefore, by Theorem 109, they are all the roots of (7.5.1).
Finally, if $f_h$ belongs to d, then (h, d) = 1; for $k \mid h$, $k \mid d$, and k > 1
would imply
$$(f_h)^{d/k} = (f^d)^{h/k} \equiv 1,$$
in which case $f_h$ would belong to a smaller index than d. Thus h must
be one of the $\phi(d)$ numbers less than and prime to d, and therefore
$$\psi(d) \le \phi(d).$$
We have plainly proved incidentally

THEOREM 111. If p is an odd prime, then there are numbers g such
that 1, g, $g^2$, ..., $g^{p-2}$ are incongruent mod p.
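Theorem 110 can be confirmed for a particular prime by tabulating the orders of 1, 2, ..., p-1. A sketch; the names `order`, `phi` and `order_counts` are ours, and `order_counts` computes the function $\psi$ of the proof.

```python
from math import gcd

def order(a, p):
    """The least d > 0 with a^d ≡ 1 (mod p), for p prime and p ∤ a."""
    d, power = 1, a % p
    while power != 1:
        power = power * a % p
        d += 1
    return d

def phi(d):
    """Euler's function, counted from the definition."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)

def order_counts(p):
    """psi(d): how many of 1, ..., p-1 belong to each exponent d."""
    counts = {}
    for a in range(1, p):
        d = order(a, p)
        counts[d] = counts.get(d, 0) + 1
    return counts

primitive_roots_of_13 = [a for a in range(1, 13) if order(a, 13) == 12]
```

For p = 13 the counts are 1, 1, 2, 2, 2, 4 for d = 1, 2, 3, 4, 6, 12, and the primitive roots are 2, 6, 7, 11.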

(2) The polynomial $f(x) = x^{p-1}-1$
is of degree $p-1$ and, by Fermat’s theorem, has the $p-1$ roots
1, 2, 3,..., $p-1$ (mod p). Applying Theorem 108, we obtain

THEOREM 112. If p is prime, then
(7.5.2) $x^{p-1}-1 \equiv (x-1)(x-2)\cdots(x-p+1) \pmod p$.

If we compare the constant terms, we obtain a new proof of Wilson’s
theorem. If we compare the coefficients of $x^{p-2}$, $x^{p-3}$,..., x, we obtain

THEOREM 113. If p is an odd prime, $1 \le l < p-1$, and $A_l$ is the
sum of the products of l different members of the set 1, 2,..., $p-1$, then
$A_l \equiv 0 \pmod p$.
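
As an illustration of Theorems 112 and 113, one may expand the product over the integers and reduce each coefficient mod p. This is only a sketch; `expand_product` and `lagrange_holds` are hypothetical helper names introduced here.

```python
def expand_product(p):
    """Integer coefficients (highest degree first) of (x-1)(x-2)...(x-p+1)."""
    coeffs = [1]
    for t in range(1, p):
        # multiply the current polynomial by (x - t)
        shifted = coeffs + [0]
        scaled = [0] + [-t * c for c in coeffs]
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

def lagrange_holds(p):
    """True if the product agrees with x^(p-1) - 1 coefficient by coefficient (mod p)."""
    target = [1] + [0] * (p - 2) + [-1]
    return all((c - t) % p == 0 for c, t in zip(expand_product(p), target))

print(expand_product(5))   # [1, -10, 35, -50, 24]
print(lagrange_holds(5))   # True
```

The middle coefficients are, up to sign, the numbers $A_l$ of Theorem 113, and the constant term is $(p-1)!$ when p is odd.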


We can use Theorem 112 to prove Theorem 76. We suppose p odd.
Suppose that $n = rp-s$ $(r \ge 1,\ 0 \le s < p)$.
Then

$\binom{p+n-1}{n} = \frac{(rp-s+p-1)!}{(rp-s)!\,(p-1)!} = \frac{(rp-s+1)(rp-s+2)\cdots(rp-s+p-1)}{(p-1)!}$

is an integer i, and

$(rp-s+1)(rp-s+2)\cdots(rp-s+p-1) = (p-1)!\,i \equiv -i \pmod p$,

by Wilson’s theorem (Theorem 80). But the left-hand side is congruent to

$(s-1)(s-2)\cdots(s-p+1) \equiv s^{p-1}-1 \pmod p$,

by Theorem 112, and is therefore congruent to $-1$ when $s = 0$ and to 0 otherwise.

7.6. Lagrange’s proof of Fermat’s and Wilson’s theorems. We
based our proof of Theorem 112 on Fermat’s theorem and on Theorem
108. Lagrange, the discoverer of the theorem, proved it directly, and
his argument contains another proof of Fermat’s theorem.

We suppose p odd. Then

(7.6.1) $(x-1)(x-2)\cdots(x-p+1) = x^{p-1} - A_1 x^{p-2} + \cdots + A_{p-1}$,

where $A_1$,... are defined as in Theorem 113. If we multiply both sides
by x and change x into $x-1$, we have

$(x-1)^p - A_1(x-1)^{p-1} + \cdots + A_{p-1}(x-1) = (x-1)(x-2)\cdots(x-p)$
$\qquad = (x-p)(x^{p-1} - A_1 x^{p-2} + \cdots + A_{p-1})$.

Equating coefficients, we obtain

$\binom{p}{1} + A_1 = p + A_1, \qquad \binom{p}{2} + \binom{p-1}{1}A_1 + A_2 = pA_1 + A_2$,

$\binom{p}{3} + \binom{p-1}{2}A_1 + \binom{p-2}{1}A_2 + A_3 = pA_2 + A_3$,

and so on. The first equation is an identity; the others yield in succession

$A_1 = \binom{p}{2}, \qquad 2A_2 = \binom{p}{3} + \binom{p-1}{2}A_1$,

$3A_3 = \binom{p}{4} + \binom{p-1}{3}A_1 + \binom{p-2}{2}A_2$,

$\ldots$

$(p-1)A_{p-1} = 1 + A_1 + A_2 + \cdots + A_{p-2}$.

Hence we deduce successively

(7.6.2) $p \mid A_1,\quad p \mid A_2,\quad \ldots,\quad p \mid A_{p-2}$,

and finally $(p-1)A_{p-1} \equiv 1 \pmod p$
or

(7.6.3) $A_{p-1} \equiv -1 \pmod p$.

Since $A_{p-1} = (p-1)!$, (7.6.3) is Wilson’s theorem; and (7.6.2) and
(7.6.3) together give Theorem 112. Finally, since

$(x-1)(x-2)\cdots(x-p+1) \equiv 0 \pmod p$

for any x which is not a multiple of p, Fermat’s theorem follows as a
corollary.
7.7. The residue of $\{\frac12(p-1)\}!$. Suppose that p is an odd prime and
$\varpi = \frac12(p-1)$.
Since

$(p-1)! = 1 \cdot 2 \cdots \tfrac12(p-1)\,\{p-\tfrac12(p-1)\}\{p-\tfrac12(p-3)\}\cdots(p-1)$
$\qquad \equiv (-1)^{\varpi}(\varpi!)^2 \pmod p$,

it follows, by Wilson’s theorem, that

$(\varpi!)^2 \equiv (-1)^{\varpi-1} \pmod p$.

We must now distinguish the two cases $p = 4n+1$ and $p = 4n+3$.

If $p = 4n+1$, then $(\varpi!)^2 \equiv -1 \pmod p$,
so that (as we proved otherwise in § 6.6) $-1$ is a quadratic residue of p.
In this case $\varpi!$ is congruent to one or other of the roots of $x^2 \equiv -1$
(mod p).

If $p = 4n+3$, then
(7.7.1) $(\varpi!)^2 \equiv 1 \pmod p$,
(7.7.2) $\varpi! \equiv \pm 1 \pmod p$.

Since $-1$ is a non-residue of p, the sign in (7.7.2) is positive or negative
according as $\varpi!$ is a residue or non-residue of p. But $\varpi!$ is the product
of the positive integers less than $\frac12 p$, and therefore, by Theorem 85, the
sign in (7.7.2) is positive or negative according as the number of
non-residues of p less than $\frac12 p$ is even or odd.

THEOREM 114. If p is a prime 4n+3, then

$\{\tfrac12(p-1)\}! \equiv (-1)^{\nu} \pmod p$,

where $\nu$ is the number of quadratic non-residues less than $\frac12 p$.
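
Theorem 114 is easy to test for small primes of the form 4n+3. The sketch below is ours, not the authors'; the helper names are hypothetical and the residues are found by squaring every number.

```python
from math import factorial

def half_factorial_sign(p):
    """Return +1 or -1 according to the residue of ((p-1)/2)! mod p.
    Assumes p is a prime of the form 4n+3, so that the residue is ±1."""
    r = factorial((p - 1) // 2) % p
    if r not in (1, p - 1):
        raise ValueError("p must be a prime of the form 4n+3")
    return 1 if r == 1 else -1

def nonresidues_below_half(p):
    """Number of quadratic non-residues of p among 1, 2, ..., (p-1)/2."""
    residues = {x * x % p for x in range(1, p)}
    return sum(1 for a in range(1, (p + 1) // 2) if a not in residues)

for p in (3, 7, 11, 19, 23):
    print(p, half_factorial_sign(p), (-1) ** nonresidues_below_half(p))
```

For each prime listed the two signs printed should agree.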


7.8. A theorem of Wolstenholme. It follows from Theorem 113
that the numerator of the fraction

$1 + \frac12 + \frac13 + \cdots + \frac{1}{p-1}$

is divisible by p; in fact the numerator is the $A_{p-2}$ of that theorem.
We can, however, go farther.

THEOREM 115. If p is a prime greater than 3, then the numerator of the
fraction

(7.8.1) $1 + \frac12 + \frac13 + \cdots + \frac{1}{p-1}$

is divisible by $p^2$.

The result is false when p = 3. It is irrelevant whether the fraction
is or is not reduced to its lowest terms, since in any case the denominator
cannot be divisible by p.


The theorem may be stated in a different form. If i is prime to m,
the congruence $ix \equiv 1 \pmod m$
has just one root, which we call the associate of i (mod m).† We may
denote this associate by $\bar{i}$, but it is often convenient, when it is plain
that we are concerned with an integer, to use the notation

$\frac{1}{i}$

(or 1/i). More generally we may, in similar circumstances, use

$\frac{b}{a}$

(or b/a) for the solution of $ax \equiv b$.

We may then (as we shall see in a moment) state Wolstenholme’s
theorem in the form

THEOREM 116. If p > 3, and 1/i is the associate of i (mod $p^2$), then

$1 + \frac12 + \frac13 + \cdots + \frac{1}{p-1} \equiv 0 \pmod{p^2}$.


We may elucidate the notation by proving first that

(7.8.2) $1 + \frac12 + \frac13 + \cdots + \frac{1}{p-1} \equiv 0 \pmod p$.‡

For this, we have only to observe that, if 0 < i < p, then

$i \cdot \frac{1}{i} \equiv 1, \qquad (p-i) \cdot \frac{1}{p-i} \equiv 1 \pmod p$.

Hence

$i\left(\frac{1}{i} + \frac{1}{p-i}\right) \equiv i \cdot \frac{1}{i} - (p-i) \cdot \frac{1}{p-i} \equiv 0 \pmod p$,

$\frac{1}{i} + \frac{1}{p-i} \equiv 0 \pmod p$,

and the result follows by summation.

We show next that the two forms of Wolstenholme’s theorem
(Theorems 115 and 116) are equivalent. If 0 < x < p and $\bar{x}$ is the
associate of x (mod $p^2$), then

$\bar{x}\,(p-1)! = x\bar{x}\,\frac{(p-1)!}{x} \equiv \frac{(p-1)!}{x} \pmod{p^2}$.

Hence

$(p-1)!\,(\bar{1} + \bar{2} + \cdots + \overline{p-1}) \equiv (p-1)!\left(1 + \frac12 + \cdots + \frac{1}{p-1}\right) \pmod{p^2}$,

the fractions on the right having their common interpretation; and the
equivalence follows.

† As in § 6.5, the a of § 6.5 being now 1.

‡ Here, naturally, 1/i is the associate of i (mod p). This is determinate (mod p), but
indeterminate (mod $p^2$) to the extent of an arbitrary multiple of p.

To prove the theorem itself we put x = p in the identity (7.6.1).
This gives

$(p-1)! = p^{p-1} - A_1 p^{p-2} + \cdots - A_{p-2}\,p + A_{p-1}$.

But $A_{p-1} = (p-1)!$, and therefore

$p^{p-2} - A_1 p^{p-3} + \cdots + A_{p-3}\,p - A_{p-2} = 0$.

Since p > 3 and

$p \mid A_1,\quad p \mid A_2,\quad \ldots,\quad p \mid A_{p-3}$,

by Theorem 113, it follows that $p^2 \mid A_{p-2}$, i.e.

$p^2 \mid (p-1)!\left(1 + \frac12 + \frac13 + \cdots + \frac{1}{p-1}\right)$.

This is equivalent to Wolstenholme’s theorem.

The numerator of

$C_p = 1 + \frac{1}{2^2} + \cdots + \frac{1}{(p-1)^2}$

is $A_{p-2}^2 - 2A_{p-1}A_{p-3}$, and is therefore divisible by p. Hence

THEOREM 117. If p > 3, then $C_p \equiv 0 \pmod p$.
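
Theorems 115 and 117 can be tested with exact rational arithmetic. The following is a small sketch under our own naming (`harmonic_numerator` is not from the text); it reads the fractions in their ordinary sense, as in Theorem 115.

```python
from fractions import Fraction

def harmonic_numerator(p, power=1):
    """Numerator, in lowest terms, of 1 + 1/2^power + ... + 1/(p-1)^power."""
    return sum(Fraction(1, t ** power) for t in range(1, p)).numerator

for p in (5, 7, 11, 13):
    # Theorem 115: divisible by p^2; Theorem 117: C_p divisible by p
    print(p, harmonic_numerator(p) % p**2, harmonic_numerator(p, 2) % p)

print(harmonic_numerator(3) % 9)  # non-zero: the result fails for p = 3
```

For p = 5 the sum is 25/12, whose numerator is exactly $5^2$.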


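7.9. The theorem of von Staudt. We conclude this chapter by
proving a famous theorem of von Staudt concerning Bernoulli’s numbers.

Bernoulli’s numbers are usually defined as the coefficients in the
expansion†

$\frac{x}{e^x-1} = 1 - \frac12 x + \frac{B_1}{2!}x^2 - \frac{B_2}{4!}x^4 + \frac{B_3}{6!}x^6 - \cdots$.

We shall find it convenient to write

$\frac{x}{e^x-1} = \beta_0 + \frac{\beta_1}{1!}x + \frac{\beta_2}{2!}x^2 + \frac{\beta_3}{3!}x^3 + \cdots$,

so that $\beta_0 = 1$, $\beta_1 = -\frac12$ and

$\beta_{2k} = (-1)^{k-1}B_k, \qquad \beta_{2k+1} = 0 \quad (k \ge 1)$.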


The importance of the numbers comes primarily from their occurrence
in the ‘Euler-Maclaurin sum-formula’ for $\sum m^k$. In fact

(7.9.1) $1^k + 2^k + \cdots + (n-1)^k = \sum_{r=0}^{k} \frac{1}{k+1-r}\binom{k}{r} n^{k+1-r}\beta_r$

for $k \ge 1$. For the left-hand side is the coefficient of $x^{k+1}$ in

$k!\,x\,(1 + e^x + e^{2x} + \cdots + e^{(n-1)x}) = k!\,x\,\frac{1-e^{nx}}{1-e^x} = k!\,\frac{x}{e^x-1}\left(nx + \frac{n^2x^2}{2!} + \cdots\right)$;

and (7.9.1) follows by picking out the coefficient in this product.

† This expansion is convergent whenever $|x| < 2\pi$.

von Staudt’s theorem determines the fractional part of $B_k$.

THEOREM 118. If $k \ge 1$, then

(7.9.2) $(-1)^k B_k \equiv \sum \frac{1}{p} \pmod 1$,

the summation being extended over the primes p such that $(p-1) \mid 2k$.


For example, if k = 1, then $(p-1) \mid 2$, which is true if p = 2 or p = 3.
Hence $-B_1 \equiv \frac12 + \frac13 = \frac56$; and in fact $B_1 = \frac16$. When we restate (7.9.2)
in terms of the $\beta$, it becomes

(7.9.3) $\beta_k + \sum_{(p-1) \mid k} \frac{1}{p} = i$,

where

(7.9.4) k = 1, 2, 4, 6,...

and i is an integer. If we define $\epsilon_k(p)$ by

$\epsilon_k(p) = 1 \quad ((p-1) \mid k), \qquad \epsilon_k(p) = 0 \quad ((p-1) \nmid k)$,

then (7.9.3) takes the form

(7.9.5) $\beta_k + \sum_{p} \frac{\epsilon_k(p)}{p} = i$,

where p now runs through all primes.

In particular von Staudt’s theorem shows that there is no squared
factor in the denominator of any Bernoullian number.
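
The statement (7.9.3) lends itself to a direct numerical check. The sketch below is ours and makes one assumption not found in the text: it computes the $\beta_n$ from the standard recurrence $\sum_{j=0}^{n}\binom{n+1}{j}\beta_j = 0$ $(n \ge 1)$, which follows from multiplying the series for $x/(e^x-1)$ by $e^x-1$. It requires Python 3.8 or later for `math.comb`.

```python
from fractions import Fraction
from math import comb

def betas(n_max):
    """beta_0, ..., beta_{n_max}, where x/(e^x - 1) = sum of beta_n x^n / n!."""
    beta = [Fraction(1)]
    for n in range(1, n_max + 1):
        s = sum(comb(n + 1, j) * beta[j] for j in range(n))
        beta.append(-s / (n + 1))
    return beta

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def von_staudt_value(k, beta_k):
    """beta_k plus the sum of 1/p over primes p with (p-1) | k, as in (7.9.3)."""
    return beta_k + sum(Fraction(1, p) for p in range(2, k + 2)
                        if is_prime(p) and k % (p - 1) == 0)

b = betas(20)
for k in (1, 2, 4, 6, 8, 10, 12):
    print(k, b[k], von_staudt_value(k, b[k]))
```

According to Theorem 118 the last value printed on each line should be an integer.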


7.10. Proof of von Staudt’s theorem. The proof of Theorem 118
depends upon the following lemma.

THEOREM 119: $\sum_{m=1}^{p-1} m^k \equiv -\epsilon_k(p) \pmod p$.

If $(p-1) \mid k$, then $m^k \equiv 1$, by Fermat’s theorem, and

$\sum m^k \equiv p-1 \equiv -1 \equiv -\epsilon_k(p) \pmod p$.

If $(p-1) \nmid k$, and g is a primitive root of p, then

(7.10.1) $g^k \not\equiv 1 \pmod p$,

by Theorem 88. The sets g, 2g,..., (p-1)g and 1, 2,..., p-1 are
equivalent (mod p), and therefore

$\sum (mg)^k \equiv \sum m^k \pmod p$,

$(g^k-1)\sum m^k \equiv 0 \pmod p$,

and $\sum m^k \equiv 0 = -\epsilon_k(p) \pmod p$,
by (7.10.1). Thus $\sum m^k \equiv -\epsilon_k(p)$ in any case.

We now prove Theorem 118 by induction, assuming that it is true for
any number l of the sequence (7.9.4) less than k, and deducing that it
is true for k. In what follows k and l belong to (7.9.4), r runs from
0 to k, $\beta_0 = 1$, and $\beta_3 = \beta_5 = \cdots = 0$. We have already verified the
theorem when k = 2, and we may suppose k > 2.

It follows from (7.9.1) and Theorem 119 that, if $\varpi$ is any prime,

$\epsilon_k(\varpi) + \sum_{r=0}^{k} \frac{1}{k+1-r}\binom{k}{r}\varpi^{k+1-r}\beta_r \equiv 0 \pmod{\varpi}$

or

(7.10.2) $\beta_k + \frac{\epsilon_k(\varpi)}{\varpi} + \sum_{r=0}^{k-2} \frac{1}{k+1-r}\binom{k}{r}\varpi^{k-1-r}(\varpi\beta_r) \equiv 0 \pmod 1$;

there is no term in $\beta_{k-1}$, since $\beta_{k-1} = 0$. We consider whether the
denominator of

$u_{k,r} = \frac{1}{k+1-r}\binom{k}{r}\varpi^{k-1-r}(\varpi\beta_r)$

can be divisible by $\varpi$.

If r is not an l, $\beta_r$ is 1 or 0. If r is an l, then, by the inductive
hypothesis, the denominator of $\beta_r$ has no squared factor,† and that of $\varpi\beta_r$
is not divisible by $\varpi$. The factor $\binom{k}{r}$ is integral. Hence the
denominator of $u_{k,r}$ is divisible by $\varpi$ only if that of

$\frac{\varpi^{k-1-r}}{k+1-r} = \frac{\varpi^{s-1}}{s+1}$

is divisible by $\varpi$. In this case

$s+1 \ge \varpi^s$.

But $s = k-r \ge 2$, and therefore

$s+1 < 2^s \le \varpi^s$,

a contradiction. It follows that the denominator of $u_{k,r}$ is not divisible
by $\varpi$.

Hence

$\beta_k + \frac{\epsilon_k(\varpi)}{\varpi} = \frac{a_k}{b_k}$,

where $\varpi \nmid b_k$; and

$\frac{\epsilon_k(p)}{p} \quad (p \ne \varpi)$

is obviously of the same form. It follows that

(7.10.3) $\beta_k + \sum_{p} \frac{\epsilon_k(p)}{p} = \frac{A_k}{B_k}$,

where $B_k$ is not divisible by $\varpi$. Since $\varpi$ is an arbitrary prime, $B_k$ must
be 1. Hence the right-hand side of (7.10.3) is an integer; and this proves
the theorem.

† It will be observed that we do not need the full force of the inductive hypothesis.
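
The lemma on which the proof rests (Theorem 119) can be checked directly. This is a brief sketch with names of our own choosing, where `epsilon` stands for the $\epsilon_k(p)$ of § 7.9.

```python
def power_sum_mod(k, p):
    """1^k + 2^k + ... + (p-1)^k reduced mod p."""
    return sum(pow(m, k, p) for m in range(1, p)) % p

def epsilon(k, p):
    """epsilon_k(p): 1 if (p-1) divides k, and 0 otherwise."""
    return 1 if k % (p - 1) == 0 else 0

for p in (2, 3, 5, 7, 11, 13):
    ok = all(power_sum_mod(k, p) == (-epsilon(k, p)) % p for k in range(1, 31))
    print(p, ok)
```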

Suppose in particular that k is a prime of the form 3n+1. Then
$(p-1) \mid 2k$ only if p is one of 2, 3, k+1, 2k+1. But k+1 is even, and
2k+1 = 6n+3 is divisible by 3, so that 2 and 3 are the only permissible
values of p. Hence

THEOREM 120. If k is a prime of the form 3n+1, then
$B_k \equiv \frac16 \pmod 1$.

The argument can be developed to prove that if k is given, there are
an infinity of l for which $B_l$ has the same fractional part as $B_k$; but for
this we need Dirichlet’s Theorem 15 (or the special case of the theorem
in which b = 1).


NOTES ON CHAPTER VII 


§§ 7.2-4. For the most part we follow Hecke, § 3. 

§ 7.6. Lagrange, Nouveaux mémoires de l’ Académie royale de Berlin, 2 (1773), 
125 (Œuvres, iii. 425). This was the first published proof of Wilson’s theorem. 

§ 7.7. Dirichlet, Journal fiir Math. 3 (1828), 407-8 (Werke, i. 107-8). 

§ 7.8. Wolstenholme, Quarterly Journal of Math. 5 (1862), 35-39. There are
many generalizations of Theorem 115, some of which are also generalizations of
Theorem 113. See § 8.7.

The theorem has generally been described as ‘Wolstenholme’s theorem’, and
we follow the usual practice. But N. Rama Rao [Bull. Calcutta Math. Soc. 29
(1938), 167-70] has pointed out that it, and a good many of its extensions, had
been anticipated by Waring, Meditationes algebraicae, ed. 2 (1782), 383.

§§ 7.9-10. von Staudt, Journal fiir Math. 21 (1840), 372-4. The theorem was 
discovered independently by Clausen, Astronomische Nachrichten, 17 (1840), 352. 
We follow a proof by R. Rado, Journal London Math. Soc. 9 (1934), 85-8. 

Theorem 120, and the more general theorem referred to in connexion with it, 
are due to Rado (ibid. 88-90). 


VIII 
CONGRUENCES TO COMPOSITE MODULI 


8.1. Linear congruences. We have supposed since § 7.4 (apart 
from a momentary digression in § 7.8) that the modulus m is prime. 
In this chapter we prove a few theorems concerning congruences to 
general moduli. The theory is much less simple when the modulus is 
composite, and we shall not attempt any systematic discussion. 

We considered the general linear congruence
(8.1.1) $ax \equiv b \pmod m$
in § 5.4, and it will be convenient to recall our results. The congruence
is insoluble unless
(8.1.2) $d = (a, m) \mid b$.
If this condition is satisfied, then (8.1.1) has just d solutions, viz.

$\xi,\quad \xi + \frac{m}{d},\quad \xi + 2\frac{m}{d},\quad \ldots,\quad \xi + (d-1)\frac{m}{d}$,

where $\xi$ is the unique solution of

$\frac{a}{d}x \equiv \frac{b}{d} \pmod{\frac{m}{d}}$.

We consider next a system
(8.1.3) $a_1x \equiv b_1 \pmod{m_1},\ a_2x \equiv b_2 \pmod{m_2},\ \ldots,\ a_kx \equiv b_k \pmod{m_k}$
of linear congruences to coprime moduli $m_1, m_2, \ldots, m_k$. The system will
be insoluble unless $(a_i, m_i) \mid b_i$ for every i. If this condition is satisfied,
we can solve each congruence separately, and the problem is reduced
to that of the solution of a certain number of systems

(8.1.4) $x \equiv c_1 \pmod{m_1},\ x \equiv c_2 \pmod{m_2},\ \ldots,\ x \equiv c_k \pmod{m_k}$.

The $m_i$ here are not the same as in (8.1.3); in fact the $m_i$ of (8.1.4) is
$m_i/(a_i, m_i)$ in the notation of (8.1.3).

We write

$m = m_1m_2\cdots m_k = m_1M_1 = m_2M_2 = \cdots = m_kM_k$.

Since $(m_i, M_i) = 1$, there is an $n_i$ (unique to modulus $m_i$) such that

$n_iM_i \equiv 1 \pmod{m_i}$.

If
(8.1.5) $x = n_1M_1c_1 + n_2M_2c_2 + \cdots + n_kM_kc_k$,
then $x \equiv n_iM_ic_i \equiv c_i \pmod{m_i}$ for every i, so that x satisfies (8.1.4).
If y satisfies (8.1.4), then

$y \equiv c_i \equiv x \pmod{m_i}$

for every i, and therefore (since the $m_i$ are coprime), $y \equiv x \pmod m$.
Hence the solution x is unique (mod m).

THEOREM 121. If $m_1, m_2, \ldots, m_k$ are coprime, then the system (8.1.4)
has a unique solution (mod m) given by (8.1.5).
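
The construction (8.1.5) translates almost line for line into code. The sketch below is one possible rendering, with `crt` a name of our own; it assumes Python 3.8 or later, where `pow(M, -1, m)` returns the inverse of M to modulus m, and it assumes that the moduli are coprime in pairs.

```python
from math import prod

def crt(residues, moduli):
    """Solve x ≡ c_i (mod m_i) for pairwise coprime m_i, following (8.1.5).
    Returns the least non-negative solution together with m = m_1 m_2 ... m_k."""
    m = prod(moduli)
    x = 0
    for c_i, m_i in zip(residues, moduli):
        M_i = m // m_i
        n_i = pow(M_i, -1, m_i)   # n_i M_i ≡ 1 (mod m_i)
        x += n_i * M_i * c_i
    return x % m, m

print(crt([2, 3, 2], [3, 5, 7]))  # → (23, 105)
```

The $n_i$ chosen here are the least positive ones; any other choice changes x only by a multiple of m.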

The problem is more complicated when the moduli are not coprime. We
content ourselves with an illustration.

Six professors begin courses of lectures on Monday, Tuesday, Wednesday,
Thursday, Friday, and Saturday, and announce their intentions of lecturing at
intervals of two, three, four, one, six, and five days respectively. The regulations
of the university forbid Sunday lectures (so that a Sunday lecture must be omitted).
When first will all six professors find themselves compelled to omit a lecture?

If the day in question is the xth (counting from and including the first
Monday), then

$x = 1+2k_1 = 2+3k_2 = 3+4k_3 = 4+k_4 = 5+6k_5 = 6+5k_6 = 7k_7$,

where the k are integers; i.e.

(1) $x \equiv 1 \pmod 2$, (2) $x \equiv 2 \pmod 3$, (3) $x \equiv 3 \pmod 4$, (4) $x \equiv 4 \pmod 1$,

(5) $x \equiv 5 \pmod 6$, (6) $x \equiv 6 \pmod 5$, (7) $x \equiv 0 \pmod 7$.

Of these congruences, (4) is no restriction, and (1) and (2) are included in (3)
and (5). Of the two latter, (3) shows that x is congruent to 3, 7, or 11 (mod 12),
and (5) that x is congruent to 5 or 11, so that (3) and (5) together are equivalent
to $x \equiv 11 \pmod{12}$. Hence the problem is that of solving

$x \equiv 11 \pmod{12},\quad x \equiv 6 \pmod 5,\quad x \equiv 0 \pmod 7$
or $x \equiv -1 \pmod{12},\quad x \equiv 1 \pmod 5,\quad x \equiv 0 \pmod 7$.

This is a case of the problem solved by Theorem 121. Here

$m_1 = 12,\quad m_2 = 5,\quad m_3 = 7,\quad m = 420$,

$M_1 = 35,\quad M_2 = 84,\quad M_3 = 60$.

The n are given by

$35n_1 \equiv 1 \pmod{12},\quad 84n_2 \equiv 1 \pmod 5,\quad 60n_3 \equiv 1 \pmod 7$,
or $-n_1 \equiv 1 \pmod{12},\quad -n_2 \equiv 1 \pmod 5,\quad 4n_3 \equiv 1 \pmod 7$;

and we can take $n_1 = -1$, $n_2 = -1$, $n_3 = 2$. Hence

$x \equiv (-1)(-1)\,35 + (-1)\,1 \cdot 84 + 2 \cdot 0 \cdot 60 = -49 \equiv 371 \pmod{420}$.

The first x satisfying the condition is 371.
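
The answer can be confirmed by a direct search over the days, without any use of Theorem 121. The sketch below is ours; it assumes, as the solution above does, that each professor keeps to his fixed interval even after a lecture has been omitted.

```python
def first_common_sunday(limit=1000):
    """Smallest day x (day 1 = the first Monday) that is a Sunday and on which
    all six professors are due to lecture; None if there is none up to limit."""
    # (first lecture day, interval in days) for the six professors
    schedule = [(1, 2), (2, 3), (3, 4), (4, 1), (5, 6), (6, 5)]
    for x in range(1, limit + 1):
        if x % 7 == 0 and all(x >= s and (x - s) % g == 0 for s, g in schedule):
            return x
    return None

print(first_common_sunday())  # → 371
```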


8.2. Congruences of higher degree. We can now reduce the
solution of the general congruence†
(8.2.1) $f(x) \equiv 0 \pmod m$,
where f(x) is any integral polynomial, to that of a number of congruences
whose moduli are powers of primes.

† See § 7.2.

Suppose that $m = m_1m_2\cdots m_k$,
no two $m_i$ having a common factor. Every solution of (8.2.1) satisfies
(8.2.2) $f(x) \equiv 0 \pmod{m_i}$ (i = 1, 2,..., k).
If $c_1, c_2, \ldots, c_k$ is a set of solutions of (8.2.2), and x is the solution of
(8.2.3) $x \equiv c_i \pmod{m_i}$ (i = 1, 2,..., k),
given by Theorem 121, then

$f(x) \equiv f(c_i) \equiv 0 \pmod{m_i}$

and therefore $f(x) \equiv 0 \pmod m$. Thus every set of solutions of (8.2.2)
gives a solution of (8.2.1), and conversely. In particular

THEOREM 122. The number of roots of (8.2.1) is the product of the
numbers of roots of the separate congruences (8.2.2).

If $m = p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k}$, we may take $m_i = p_i^{a_i}$.

8.3. Congruences to a prime-power modulus. We have now
to consider the congruence
(8.3.1) $f(x) \equiv 0 \pmod{p^a}$,
where p is prime and a > 1.

Suppose first that x is a root of (8.3.1) for which
(8.3.2) $0 \le x < p^a$.
Then x satisfies
(8.3.3) $f(x) \equiv 0 \pmod{p^{a-1}}$,
and is of the form
(8.3.4) $\xi + sp^{a-1}$ $(0 \le s < p)$,
where $\xi$ is a root of (8.3.3) for which
(8.3.5) $0 \le \xi < p^{a-1}$.

Next, if $\xi$ is a root of (8.3.3) satisfying (8.3.5), then

$f(\xi + sp^{a-1}) = f(\xi) + sp^{a-1}f'(\xi) + \tfrac12 s^2p^{2a-2}f''(\xi) + \cdots$
$\qquad \equiv f(\xi) + sp^{a-1}f'(\xi) \pmod{p^a}$,

since $2a-2 \ge a$, $3a-3 \ge a$,..., and the coefficients in

$\frac{f^{(k)}(\xi)}{k!}$

are integers. We have now to distinguish two cases.

(1) Suppose that
(8.3.6) $f'(\xi) \not\equiv 0 \pmod p$.
Then $\xi + sp^{a-1}$ is a root of (8.3.1) if and only if

$f(\xi) + sp^{a-1}f'(\xi) \equiv 0 \pmod{p^a}$

or $sf'(\xi) \equiv -\frac{f(\xi)}{p^{a-1}} \pmod p$,

and there is just one s (mod p) satisfying this condition. Hence the
number of roots of (8.3.3) is the same as the number of roots of (8.3.1).

(2) Suppose that
(8.3.7) $f'(\xi) \equiv 0 \pmod p$.
Then $f(\xi + sp^{a-1}) \equiv f(\xi) \pmod{p^a}$.
If $f(\xi) \not\equiv 0 \pmod{p^a}$, then (8.3.1) is insoluble. If $f(\xi) \equiv 0 \pmod{p^a}$,
then (8.3.4) is a solution of (8.3.1) for every s, and there are p solutions
of (8.3.1) corresponding to every solution of (8.3.3).

THEOREM 123. The number of solutions of (8.3.1) corresponding to a
solution $\xi$ of (8.3.3) is

(a) none, if $f'(\xi) \equiv 0 \pmod p$ and $\xi$ is not a solution of (8.3.1);
(b) one, if $f'(\xi) \not\equiv 0 \pmod p$;
(c) p, if $f'(\xi) \equiv 0 \pmod p$ and $\xi$ is a solution of (8.3.1).

The solutions of (8.3.1) corresponding to $\xi$ may be derived from $\xi$, in case
(b) by the solution of a linear congruence, in case (c) by adding any multiple
of $p^{a-1}$ to $\xi$.
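
The three cases of Theorem 123 give a step-by-step procedure for passing from the roots mod p to the roots mod $p^a$. The following is a minimal sketch of that procedure, not a definitive implementation; the function names are ours, polynomials are given as integer coefficient lists with the highest degree first, and `pow(d, -1, p)` assumes Python 3.8 or later.

```python
def poly_eval(coeffs, x, mod):
    """Evaluate the polynomial at x, reduced to the given modulus."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % mod
    return acc

def derivative(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def lift_roots(coeffs, p, a):
    """Roots of f(x) ≡ 0 (mod p^a), built up from the roots mod p."""
    roots = [x for x in range(p) if poly_eval(coeffs, x, p) == 0]
    dcoeffs = derivative(coeffs)
    for e in range(2, a + 1):
        pe, prev = p ** e, p ** (e - 1)
        lifted = []
        for xi in roots:
            fx = poly_eval(coeffs, xi, pe)       # a multiple of p^(e-1)
            d = poly_eval(dcoeffs, xi, p)
            if d != 0:
                # case (b): s f'(xi) ≡ -f(xi)/p^(e-1) (mod p) has one solution
                s = (-(fx // prev) * pow(d, -1, p)) % p
                lifted.append(xi + s * prev)
            elif fx == 0:
                # case (c): every s gives a root
                lifted.extend(xi + s * prev for s in range(p))
            # case (a): f'(xi) ≡ 0 but f(xi) is not ≡ 0, so nothing lies above xi
        roots = lifted
    return sorted(roots)

print(lift_roots([1, 0, -2], 7, 2))   # roots of x^2 ≡ 2 (mod 49): [10, 39]
print(lift_roots([1, 0, -1], 2, 3))   # roots of x^2 ≡ 1 (mod 8): [1, 3, 5, 7]
```

The first call exercises case (b) only; the second shows case (c), where each root mod $2^{e-1}$ gives two roots mod $2^e$.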

8.4. Examples. (1) The congruence

$f(x) = x^{p-1}-1 \equiv 0 \pmod p$

has the p-1 roots 1, 2,..., p-1; and if $\xi$ is any one of these, then

$f'(\xi) = (p-1)\xi^{p-2} \not\equiv 0 \pmod p$.

Hence $f(x) \equiv 0 \pmod{p^2}$ has just p-1 roots. Repeating the argument,
we obtain

THEOREM 124. The congruence

$x^{p-1}-1 \equiv 0 \pmod{p^a}$

has just p-1 roots for every a.

(2) We consider next the congruence
(8.4.1) $f(x) = x^{\frac12 p(p-1)}-1 \equiv 0 \pmod{p^2}$,
where p is an odd prime. Here

$f'(\xi) = \tfrac12 p(p-1)\,\xi^{\frac12 p(p-1)-1} \equiv 0 \pmod p$

for every $\xi$. Hence there are p roots of (8.4.1) corresponding to every
root of $f(x) \equiv 0 \pmod p$.

Now, by Theorem 83,

$x^{\frac12(p-1)} \equiv \pm 1 \pmod p$

according as x is a quadratic residue or non-residue of p, and

$x^{\frac12 p(p-1)} \equiv \pm 1 \pmod p$

in the same cases. Hence there are $\frac12(p-1)$ roots of $f(x) \equiv 0 \pmod p$,
and $\frac12 p(p-1)$ of (8.4.1).

We define the quadratic residues and non-residues of $p^2$ as we defined
those of p in § 6.5. We consider only numbers prime to p. We say that
x is a residue of $p^2$ if (i) (x, p) = 1 and (ii) there is a y for which

$y^2 \equiv x \pmod{p^2}$,

and a non-residue if (i) (x, p) = 1 and (ii) there is no such y.

If x is a quadratic residue of $p^2$, then, by Theorem 72,

$x^{\frac12 p(p-1)} \equiv y^{p(p-1)} \equiv 1 \pmod{p^2}$,

so that x is one of the $\frac12 p(p-1)$ roots of (8.4.1). On the other hand,
if $y_1$ and $y_2$ are two of the p(p-1) numbers less than and prime to $p^2$,
and $y_1^2 \equiv y_2^2$, then either $y_2 = p^2-y_1$ or $y_1-y_2$ and $y_1+y_2$ are both
divisible by p, which is impossible because $y_1$ and $y_2$ are not divisible
by p. Hence the numbers $y^2$ give just $\frac12 p(p-1)$ incongruent residues
(mod $p^2$), and there are $\frac12 p(p-1)$ quadratic residues of $p^2$, namely the
roots of (8.4.1).

THEOREM 125. There are $\frac12 p(p-1)$ quadratic residues of $p^2$, and these
residues are the roots of (8.4.1).
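
Theorem 125 can be confirmed for small odd primes by listing both sets. This is a brute-force sketch with function names of our own.

```python
def quadratic_residues(p):
    """Quadratic residues of p^2 among the numbers prime to p."""
    m = p * p
    return {y * y % m for y in range(1, m) if y % p != 0}

def roots_of_8_4_1(p):
    """Roots of x^(p(p-1)/2) ≡ 1 (mod p^2)."""
    m, e = p * p, p * (p - 1) // 2
    return {x for x in range(1, m) if pow(x, e, m) == 1}

for p in (3, 5, 7):
    r = quadratic_residues(p)
    print(p, len(r), r == roots_of_8_4_1(p))
```

For p = 3 the residues are 1, 4, and 7, which are also the three roots of $x^3 \equiv 1 \pmod 9$.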

(3) We consider finally the congruence
(8.4.2) $f(x) = x^2-c \equiv 0 \pmod{p^a}$,
where $p \nmid c$. If p is odd, then

$f'(\xi) = 2\xi \not\equiv 0 \pmod p$

for any $\xi$ not divisible by p. Hence the number of roots of (8.4.2) is the
same as that of the similar congruences to moduli $p^{a-1}$, $p^{a-2}$,..., p; that
is to say, two or none, according as c is or is not a quadratic residue of p.
We could use this argument as a substitute for the last paragraph of (2).

The situation is a little more complex when p = 2, since then
$f'(\xi) \equiv 0 \pmod p$ for every $\xi$. We leave it to the reader to show that
there are two roots or none when a = 2 and four or none when $a \ge 3$.


8.5. Bauer’s identical congruence. We denote by t one of the
$\phi(m)$ numbers less than and prime to m, by t(m) the set of such numbers,
and by

(8.5.1) $f_m(x) = \prod_{t(m)} (x-t)$

a product extended over all the t of t(m). Lagrange’s Theorem 112
states that

(8.5.2) $f_m(x) \equiv x^{\phi(m)}-1 \pmod m$

when m is prime. Since

$x^{\phi(m)}-1 \equiv 0 \pmod m$

has always the $\phi(m)$ roots t, we might expect (8.5.2) to be true for all m;
but this is false. Thus, when m = 9, t has the 6 values ±1, ±2, ±4
(mod 9), and

$f_m(x) \equiv (x^2-1^2)(x^2-2^2)(x^2-4^2) \equiv x^6-3x^4+3x^2-1 \pmod 9$.

The correct generalization was found comparatively recently by
Bauer, and is contained in the two theorems which follow.

THEOREM 126. If p is an odd prime divisor of m, and $p^a$ is the highest
power of p which divides m, then
(8.5.3) $f_m(x) = \prod_{t(m)} (x-t) \equiv (x^{p-1}-1)^{\phi(m)/(p-1)} \pmod{p^a}$.

In particular
(8.5.4) $f_{p^a}(x) = \prod_{t(p^a)} (x-t) \equiv (x^{p-1}-1)^{p^{a-1}} \pmod{p^a}$.

THEOREM 127. If m is even, m > 2, and $2^a$ is the highest power of 2
which divides m, then

(8.5.5) $f_m(x) \equiv (x^2-1)^{\frac12\phi(m)} \pmod{2^a}$.

In particular

(8.5.6) $f_{2^a}(x) \equiv (x^2-1)^{2^{a-2}} \pmod{2^a}$

when a > 1.

In the trivial case m = 2, $f_m(x) = x-1$. This falls under (8.5.3) and not under
(8.5.5).
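
Both of Bauer's congruences can be tested by multiplying out the product $f_m(x)$ with coefficients reduced to the modulus $p^a$. The following is a rough sketch; the polynomial helpers and the name `bauer_holds` are our own, and coefficient lists are written with the highest degree first.

```python
from math import gcd

def poly_mul(a, b, mod):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % mod
    return out

def poly_pow(base, n, mod):
    result = [1]
    for _ in range(n):
        result = poly_mul(result, base, mod)
    return result

def f_m(m, mod):
    """Coefficients of the product of (x - t) over t prime to m, reduced mod `mod`."""
    result = [1]
    for t in range(1, m):
        if gcd(t, m) == 1:
            result = poly_mul(result, [1, -t], mod)
    return result

def bauer_holds(m, p, a):
    """Check (8.5.3) for odd p, or (8.5.5) for p = 2 and m > 2.
    Assumes that p^a is the highest power of p dividing m."""
    mod = p ** a
    phi = sum(1 for t in range(1, m) if gcd(t, m) == 1)
    if p == 2:
        rhs = poly_pow([1, 0, -1], phi // 2, mod)
    else:
        rhs = poly_pow([1] + [0] * (p - 2) + [-1], phi // (p - 1), mod)
    return f_m(m, mod) == rhs

print(f_m(9, 9))              # [1, 0, 6, 0, 3, 0, 8], i.e. x^6 - 3x^4 + 3x^2 - 1
print(bauer_holds(9, 3, 2))   # True
print(bauer_holds(12, 2, 2))  # True
```

The first line reproduces the example m = 9 above, where the result differs from $x^6-1$.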
We suppose first that p > 2, and begin by proving (8.5.4). This is
true when a = 1. If a > 1, the numbers in $t(p^a)$ are the numbers

$t + \nu p^{a-1}$ $(0 \le \nu < p)$,

where t is a number included in $t(p^{a-1})$. Hence

$f_{p^a}(x) = \prod_{\nu=0}^{p-1} f_{p^{a-1}}(x - \nu p^{a-1})$.

But $f_{p^{a-1}}(x - \nu p^{a-1}) \equiv f_{p^{a-1}}(x) - \nu p^{a-1}f'_{p^{a-1}}(x) \pmod{p^a}$;
and

$f_{p^a}(x) \equiv \{f_{p^{a-1}}(x)\}^p - \sum\nu \cdot p^{a-1}\{f_{p^{a-1}}(x)\}^{p-1}f'_{p^{a-1}}(x)$
$\qquad \equiv \{f_{p^{a-1}}(x)\}^p \pmod{p^a}$,

since $\sum\nu = \frac12 p(p-1) \equiv 0 \pmod p$.
This proves (8.5.4) by induction.

Suppose now that $m = p^aM$ and that $p \nmid M$. Let t run through the
$\phi(p^a)$ numbers of $t(p^a)$ and T through the $\phi(M)$ numbers of t(M). By
Theorem 61, the resulting set of $\phi(m)$ numbers

$tM + Tp^a$,

reduced mod m, is just the set t(m). Hence

$f_m(x) = \prod_{t(m)} (x-t) \equiv \prod_{T \in t(M)}\ \prod_{t \in t(p^a)} (x - tM - Tp^a) \pmod m$.

For any fixed T, since $(p^a, M) = 1$,

$\prod_{t \in t(p^a)} (x - tM - Tp^a) \equiv \prod_{t \in t(p^a)} (x - tM) \equiv \prod_{t \in t(p^a)} (x-t) \equiv f_{p^a}(x) \pmod{p^a}$.

Hence, since there are $\phi(M)$ members of t(M),

$f_m(x) \equiv (x^{p-1}-1)^{p^{a-1}\phi(M)} \pmod{p^a}$

by (8.5.4). But (8.5.3) follows at once, since

$p^{a-1}\phi(M) = \frac{\phi(p^a)}{p-1}\,\phi(M) = \frac{\phi(m)}{p-1}$.
8.6. Bauer’s congruence: the case p = 2. We have now to
consider the case p = 2. We begin by proving (8.5.6).

If a = 2, $f_4(x) = (x-1)(x-3) \equiv x^2-1 \pmod 4$,
which is (8.5.6). When a > 2, we proceed by induction. If

$f_{2^{a-1}}(x) \equiv (x^2-1)^{2^{a-3}} \pmod{2^{a-1}}$,

then $f'_{2^{a-1}}(x) \equiv 0 \pmod 2$.
Hence

$f_{2^a}(x) = f_{2^{a-1}}(x)\,f_{2^{a-1}}(x - 2^{a-1})$
$\qquad \equiv \{f_{2^{a-1}}(x)\}^2 - 2^{a-1}f_{2^{a-1}}(x)\,f'_{2^{a-1}}(x)$
$\qquad \equiv \{f_{2^{a-1}}(x)\}^2 \equiv (x^2-1)^{2^{a-2}} \pmod{2^a}$.

Passing to the proof of (8.5.5), we have now to distinguish two cases.

(1) If m = 2M, where M is odd, then

$f_m(x) \equiv (x-1)^{\phi(m)} \equiv (x^2-1)^{\frac12\phi(m)} \pmod 2$,

because $(x-1)^2 \equiv x^2-1 \pmod 2$.

(2) If $m = 2^aM$, where M is odd and a > 1, we argue as in § 8.5,
but use (8.5.6) instead of (8.5.4). The set of $\phi(m) = 2^{a-1}\phi(M)$ numbers

$tM + T2^a$,

reduced mod m, is just the set t(m). Hence

$f_m(x) = \prod_{t(m)} (x-t) \equiv \prod_{T \in t(M)}\ \prod_{t \in t(2^a)} (x - tM - 2^aT) \pmod m$
$\qquad \equiv \{f_{2^a}(x)\}^{\phi(M)} \pmod{2^a}$,

just as in § 8.5. (8.5.5) follows at once from this and (8.5.6).


8.7. A theorem of Leudesdorf. We can use Bauer’s theorem to
obtain a comprehensive generalization of Wolstenholme’s Theorem 115.

THEOREM 128. If

$S_m = \sum_{t(m)} \frac{1}{t}$,

then
(8.7.1) $S_m \equiv 0 \pmod{m^2}$
if $2 \nmid m$, $3 \nmid m$;
(8.7.2) $S_m \equiv 0 \pmod{\frac13 m^2}$
if $2 \nmid m$, $3 \mid m$;
(8.7.3) $S_m \equiv 0 \pmod{\frac12 m^2}$
if $2 \mid m$, $3 \nmid m$, and m is not a power of 2;
(8.7.4) $S_m \equiv 0 \pmod{\frac16 m^2}$
if $2 \mid m$, $3 \mid m$; and
(8.7.5) $S_m \equiv 0 \pmod{\frac14 m^2}$
if $m = 2^a$.

We use $\sum$, $\prod$ for sums or products over the range t(m), and $\sum'$, $\prod'$
for sums or products over the part of the range in which t is less than
$\frac12 m$; and we suppose that $m = p^aq^br^c\cdots$.

If p > 2 then, by Theorem 126,

(8.7.6) $(x^{p-1}-1)^{\phi(m)/(p-1)} \equiv \prod (x-t) = \prod' \{(x-t)(x-m+t)\}$
$\qquad \equiv \prod' \{x^2 + t(m-t)\} \pmod{p^a}$.

We compare the coefficients of $x^2$ on the two sides of (8.7.6). If p > 3,
the coefficient on the left is 0, and

(8.7.7) $0 \equiv \prod' \{t(m-t)\} \sum' \frac{1}{t(m-t)} = \tfrac12 \prod t \sum \frac{1}{t(m-t)} \pmod{p^a}$.

Hence

$S_m \prod t = \tfrac12 \prod t \sum \left(\frac{1}{t} + \frac{1}{m-t}\right)$
$\qquad = \tfrac12 m \prod t \sum \frac{1}{t(m-t)} \equiv 0 \pmod{p^{2a}}$,

or

(8.7.8) $S_m \equiv 0 \pmod{p^{2a}}$.

If $2 \nmid m$, $3 \nmid m$, and we apply (8.7.8) to every prime factor of m, we
obtain (8.7.1).

If p = 3, then (8.7.7) must be replaced by

$(-1)^{\frac12\phi(m)-1}\,\tfrac12\phi(m) \equiv \tfrac12 \prod t \sum \frac{1}{t(m-t)} \pmod{3^a}$,

so that

$S_m \prod t \equiv (-1)^{\frac12\phi(m)-1}\,\tfrac12 m\,\phi(m) \pmod{3^{2a}}$.

Since $\phi(m)$ is even, and divisible by $3^{a-1}$, this gives

$S_m \equiv 0 \pmod{3^{2a-1}}$.

Hence we obtain (8.7.2).

If p = 2, then, by Theorem 127,

$(x^2-1)^{\frac12\phi(m)} \equiv \prod' \{x^2 + t(m-t)\} \pmod{2^a}$

and so

$(-1)^{\frac12\phi(m)-1}\,\tfrac12\phi(m) \equiv \tfrac12 \prod t \sum \frac{1}{t(m-t)} \pmod{2^a}$,

$S_m \prod t = \tfrac12 m \prod t \sum \frac{1}{t(m-t)} \equiv (-1)^{\frac12\phi(m)-1}\,\tfrac12 m\,\phi(m) \pmod{2^{2a}}$.

If $m = 2^aM$, where M is odd and greater than 1, then

$\tfrac12\phi(m) = 2^{a-2}\phi(M)$

is divisible by $2^{a-1}$, and

$S_m \equiv 0 \pmod{2^{2a-1}}$.

This, with the preceding results, gives (8.7.3) and (8.7.4).

Finally, if $m = 2^a$, $\frac12\phi(m) = 2^{a-2}$, and

$S_m \equiv 0 \pmod{2^{2a-2}}$.

This is (8.7.5).
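
The five cases of Theorem 128 can be checked together with exact fractions. The sketch below uses names of our own; it interprets $S_m \equiv 0$ as a statement about the numerator of $S_m$, which is legitimate because the denominator is prime to m.

```python
from fractions import Fraction
from math import gcd

def leudesdorf_sum(m):
    """S_m: the sum of 1/t over 1 <= t < m with (t, m) = 1, as an exact fraction."""
    return sum(Fraction(1, t) for t in range(1, m) if gcd(t, m) == 1)

def leudesdorf_modulus(m):
    """Modulus given by Theorem 128 (m >= 2)."""
    if m & (m - 1) == 0:                     # m is a power of 2
        return m * m // 4
    divisor = (2 if m % 2 == 0 else 1) * (3 if m % 3 == 0 else 1)
    return m * m // divisor

for m in (5, 9, 10, 12, 8):
    s = leudesdorf_sum(m)
    print(m, s, s.numerator % leudesdorf_modulus(m))
```

Each line should end in 0. For m = 9, for instance, $S_9 = 621/280$ and $621 = 27 \cdot 23$.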


8.8. Further consequences of Bauer’s theorem. (1) Suppose that

$m > 2,\quad m = \prod p^a,\quad u_2 = \tfrac12\phi(m),\quad u_p = \frac{\phi(m)}{p-1} \quad (p > 2)$.

Then $\phi(m)$ is even and, when we equate the constant terms in (8.5.3)
and (8.5.5), we obtain

$\prod_{t(m)} t \equiv (-1)^{u_p} \pmod{p^a}$.

It is easily verified that the numbers $u_2$ and $u_p$ are all even, except
when m is of one of the special forms 4, $p^a$, or $2p^a$; so that $\prod t \equiv 1$
(mod m) except in these cases. If m = 4, then $\prod t = 1 \cdot 3 \equiv -1 \pmod 4$.
If m is $p^a$ or $2p^a$, then $u_p$ is odd, so that $\prod t \equiv -1 \pmod{p^a}$ and
therefore (since $\prod t$ is odd) $\prod t \equiv -1 \pmod m$.

THEOREM 129: $\prod_{t(m)} t \equiv \pm 1 \pmod m$,

where the negative sign is to be chosen when m is 4, $p^a$, or $2p^a$, where p
is an odd prime, and the positive sign in all other cases.

The case m = p is Wilson’s theorem.
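
Theorem 129 is simple to verify for small m. The following sketch computes the product directly; `unit_product` is a name of our own.

```python
from math import gcd

def unit_product(m):
    """Product of the numbers less than and prime to m, reduced mod m."""
    prod = 1
    for t in range(1, m):
        if gcd(t, m) == 1:
            prod = prod * t % m
    return prod

# m of the forms 4, p^a, 2p^a: the product should be ≡ -1, i.e. m - 1
print([unit_product(m) - m for m in (4, 5, 9, 10, 18, 25)])
# other m > 2: the product should be ≡ 1
print([unit_product(m) for m in (8, 12, 15, 16, 20, 21)])
```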

(2) If p > 2 and

$f(x) = \prod_{t(p^a)} (x-t) = x^{\phi(p^a)} - A_1x^{\phi(p^a)-1} + \cdots$,

then $f(x) = f(p^a-x)$. Hence

$2A_1x^{\phi(p^a)-1} + 2A_3x^{\phi(p^a)-3} + \cdots = f(-x) - f(x) = f(p^a+x) - f(x)$
$\qquad \equiv p^af'(x) \pmod{p^{2a}}$.

But $p^af'(x) \equiv -p^{2a-1}x^{p-2}(x^{p-1}-1)^{p^{a-1}-1} \pmod{p^{2a}}$
by Theorem 126. It follows that $A_{2\nu+1}$ is a multiple of $p^{2a}$ except when

$\phi(p^a) - 2\nu - 1 \equiv p-2 \pmod{p-1}$,

i.e. when $2\nu \equiv 0 \pmod{p-1}$.

THEOREM 130. If $A_{2\nu+1}$ is the sum of the homogeneous products, $2\nu+1$
at a time, of the numbers of $t(p^a)$, and $2\nu$ is not a multiple of p-1, then

$A_{2\nu+1} \equiv 0 \pmod{p^{2a}}$.

Wolstenholme’s theorem is the case

$a = 1,\quad 2\nu+1 = p-2,\quad p > 3$.

(3) There are also interesting theorems concerning the sums

$S_{2\nu+1} = \sum \frac{1}{t^{2\nu+1}}$.

We confine ourselves for simplicity to the case a = 1, m = p,† and
suppose p > 2. Then $f(x) = f(p-x)$ and

$f(-x) = f(p+x) \equiv f(x) + pf'(x)$,
$f'(-x) = -f'(p+x) \equiv -f'(x) - pf''(x)$,
$f(x)f'(-x) + f'(x)f(-x) \equiv p\{f'^2(x) - f(x)f''(x)\}$

to modulus $p^2$. Since $f(x) \equiv x^{p-1}-1 \pmod p$,

$f'^2(x) - f(x)f''(x) \equiv 2x^{p-3} - x^{2p-4} \pmod p$

and so
(8.8.1) $f(x)f'(-x) + f'(x)f(-x) \equiv p(2x^{p-3} - x^{2p-4}) \pmod{p^2}$.

Now

$\frac{f'(x)}{f(x)} = \sum \frac{1}{x-t} = -S_1 - xS_2 - x^2S_3 - \cdots$,‡

(8.8.2) $\frac{f(x)f'(-x) + f'(x)f(-x)}{f(x)f(-x)} = \frac{f'(x)}{f(x)} + \frac{f'(-x)}{f(-x)} = -2S_1 - 2x^2S_3 - \cdots$.

† In this case Theorem 112 is sufficient for our purpose, and we do not require the
general form of Bauer’s theorem.
‡ The series which follow are ordinary power series in the variable x.

Also

$f(x) = \prod (x-t) = \prod (t-x) = \varpi\left(1 - \frac{a_1x}{\varpi} + \frac{a_2x^2}{\varpi^2} - \cdots\right)$,

$\frac{1}{f(x)} = \frac{1}{\varpi}\left(1 + \frac{b_1x}{\varpi} + \frac{b_2x^2}{\varpi^2} + \cdots\right)$,

(8.8.3) $\frac{1}{f(x)f(-x)} = \frac{1}{\varpi^2}\left(1 + \frac{c_1x^2}{\varpi^2} + \frac{c_2x^4}{\varpi^4} + \cdots\right)$,

where $\varpi = (p-1)!$ and the a, b, and c are integers. It follows from
(8.8.1), (8.8.2), and (8.8.3) that

$-2S_1 - 2x^2S_3 - \cdots = \frac{p(2x^{p-3} - x^{2p-4}) + p^2g(x)}{\varpi^2}\left(1 + \frac{c_1x^2}{\varpi^2} + \frac{c_2x^4}{\varpi^4} + \cdots\right)$,

where g(x) is an integral polynomial. Hence, if $2\nu < p-3$, the
numerator of $S_{2\nu+1}$ is divisible by $p^2$.

THEOREM 131. If p is prime, $2\nu < p-3$, and

$S_{2\nu+1} = 1 + \frac{1}{2^{2\nu+1}} + \cdots + \frac{1}{(p-1)^{2\nu+1}}$,

then the numerator of $S_{2\nu+1}$ is divisible by $p^2$.

The case $\nu = 0$ is Wolstenholme’s theorem. When $\nu = 1$, p must be
greater than 5. The numerator of

$1 + \frac{1}{2^3} + \frac{1}{3^3} + \frac{1}{4^3}$

is divisible by 5 but not by $5^2$.

There are many more elaborate theorems of the same character.
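
Theorem 131, and the exceptional example with p = 5, can be checked with exact fractions. This is a short sketch; `odd_power_sum` is a name of our own.

```python
from fractions import Fraction

def odd_power_sum(p, nu):
    """S_{2nu+1} = 1 + 1/2^(2nu+1) + ... + 1/(p-1)^(2nu+1), as an exact fraction."""
    e = 2 * nu + 1
    return sum(Fraction(1, t ** e) for t in range(1, p))

print(odd_power_sum(5, 1))                        # 2035/1728
print(odd_power_sum(7, 1).numerator % 7**2)       # 0, since 2 < 7 - 3
```

Here $2035 = 5 \cdot 11 \cdot 37$, which is divisible by 5 but not by 25, as stated above.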


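8.9. The residues of $2^{p-1}$ and (p-1)! to modulus $p^2$. Fermat’s
and Wilson’s theorems show that $2^{p-1}$ and (p-1)! have the residues
1 and -1 (mod p). Little is known about their residues (mod $p^2$), but
they can be transformed in interesting ways.

THEOREM 132. If p is an odd prime, then

(8.9.1) $\frac{2^{p-1}-1}{p} \equiv 1 + \frac13 + \frac15 + \cdots + \frac{1}{p-2} \pmod p$.

In other words, the residue of $2^{p-1}$ (mod $p^2$) is

$1 + p\left(1 + \frac13 + \frac15 + \cdots + \frac{1}{p-2}\right)$,

where the fractions indicate associates (mod p).

We have

$2^p = (1+1)^p = 1 + \binom{p}{1} + \cdots + \binom{p}{p} = 2 + \sum_{l=1}^{p-1} \binom{p}{l}$.

Every term on the right, except the first, is divisible by p,† and

$\binom{p}{l} = px_l$,

where

$l!\,x_l = (p-1)(p-2)\cdots(p-l+1) \equiv (-1)^{l-1}(l-1)! \pmod p$,
or $lx_l \equiv (-1)^{l-1} \pmod p$.
Hence

$x_l \equiv (-1)^{l-1}\frac{1}{l} \pmod p$,

$\binom{p}{l} = px_l \equiv (-1)^{l-1}\frac{p}{l} \pmod{p^2}$,

(8.9.2) $\frac{2^p-2}{p} = \sum_{l=1}^{p-1} x_l \equiv 1 - \frac12 + \frac13 - \cdots - \frac{1}{p-1} \pmod p$.

But

$1 - \frac12 + \frac13 - \cdots - \frac{1}{p-1} = 2\left(1 + \frac13 + \frac15 + \cdots + \frac{1}{p-2}\right) - \left(1 + \frac12 + \frac13 + \cdots + \frac{1}{p-1}\right)$
$\qquad \equiv 2\left(1 + \frac13 + \frac15 + \cdots + \frac{1}{p-2}\right) \pmod p$,

by Theorem 116,‡ so that (8.9.2) is equivalent to (8.9.1).

Alternatively, after Theorem 116, the residue in (8.9.1) is

$-\frac12 - \frac14 - \cdots - \frac{1}{p-1}$.

THEOREM 133. If p is an odd prime, then

$(p-1)! \equiv (-1)^{\frac12(p-1)}\,2^{2p-2}\left\{\left(\frac{p-1}{2}\right)!\right\}^2 \pmod{p^2}$.

Let p = 2n+1. Then

$\frac{(2n)!}{2^n\,n!} = 1 \cdot 3 \cdots (2n-1) = (p-2)(p-4)\cdots(p-2n)$,

$(-1)^n\frac{(2n)!}{2^n\,n!} \equiv 2^n\,n! - 2^n\,n!\,p\left(\frac12 + \frac14 + \cdots + \frac{1}{2n}\right) \pmod{p^2}$
$\qquad \equiv 2^n\,n! + 2^n\,n!\,(2^{2n}-1) \pmod{p^2}$,

by Theorems 116 and 132; and

$(2n)! \equiv (-1)^n\,2^{4n}(n!)^2 \pmod{p^2}$.

† By Theorem 75.

‡ We need only (7.8.2).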
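
Theorems 132 and 133 can both be tested numerically. The following sketch uses function names of our own and reads the fractions in (8.9.1) as associates (mod p), computed with `pow(t, -1, p)` (Python 3.8 or later).

```python
from math import factorial

def theorem_132_holds(p):
    """(2^(p-1) - 1)/p ≡ 1 + 1/3 + ... + 1/(p-2) (mod p), for an odd prime p."""
    lhs = ((pow(2, p - 1, p * p) - 1) // p) % p
    rhs = sum(pow(t, -1, p) for t in range(1, p - 1, 2)) % p
    return lhs == rhs

def theorem_133_holds(p):
    """(p-1)! ≡ (-1)^n 2^(2p-2) (n!)^2 (mod p^2), where p = 2n+1 is an odd prime."""
    n = (p - 1) // 2
    m = p * p
    lhs = factorial(p - 1) % m
    rhs = (-1) ** n * pow(2, 2 * p - 2, m) * factorial(n) ** 2 % m
    return lhs == rhs

print([theorem_132_holds(p) for p in (3, 5, 7, 11, 13)])
print([theorem_133_holds(p) for p in (3, 5, 7, 11, 13)])
```

For p = 5, for example, both sides of (8.9.1) are 3, and both sides of Theorem 133 are 24 (mod 25).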
NOTES ON CHAPTER VIII


§ 8.1. Theorem 121 (Gauss, D.A., § 36) was known to the Chinese
mathematician Sun-Tsu in the first century A.D. See Bachmann, Niedere Zahlentheorie, i. 83.

§ 8.5. Bauer, Nouvelles annales (4), 2 (1902), 256-64. Rear-Admiral C. R.
Darlington suggested the method by which I deduce (8.5.3) from (8.5.4). This is
much simpler than that used in earlier editions, which was given by Hardy and
Wright, Journal London Math. Soc. 9 (1934), 38-41 and 240.

Dr. Wylie points out to us that (8.5.5) is equivalent to (8.5.3), with 2 for p,
except when m is a power of 2, since it may easily be verified that

$(x^2-1)^{\frac12\phi(m)} \equiv (x-1)^{\phi(m)} \pmod{2^a}$

when $m = 2^aM$, M is odd, and M > 1.

§ 8.7. Leudesdorf, Proc. London Math. Soc. (1) 20 (1889), 199-212. See also
S. Chowla, Journal London Math. Soc. 9 (1934), 246; N. Rama Rao, ibid. 12
(1937), 247-50; and E. Jacobstal, Forhand. K. Norske Vidensk. Selskab, 22 (1949),
nos. 12, 13, 41.

§ 8.8. Theorem 129 (Gauss, D.A., § 78) is sometimes called the ‘generalized
Wilson’s theorem’.

Many theorems of the type of Theorems 130 and 131 will be found in Leudesdorf’s
paper quoted above, and in papers by Glaisher in vols. 31 and 32 of the Quarterly
Journal of Mathematics.

§ 8.9. Theorem 132 is due to Eisenstein (1850). Full references to later proofs
and generalizations will be found in Dickson, History, i, ch. iv. See also the note
to § 6.6.


IX. 
THE REPRESENTATION OF NUMBERS BY DECIMALS 
9.1. The decimal associated with a given number. There is a 


process for expressing any positive number € as a ‘decimal’ which is 
familiar in elementary arithmetic. 


We write 

(9.1.1) é = [€J+2= X+2, 

where X is an integer and 0 <x <1,+ and consider X and x separately. 
If X > 0 and 10 <x < 10%, 


and A, and X, are the quotient and remainder when X is divided by 


ie sthen X =A, 14X, 
where 0 < A, = [10X] < 10, 0 < X, < 10. 
Similarly 


X, =4,. 10-14.X, (0 <A, < 10,0 < X, < 10°), 
X, = A,. 1024+X, (0 <A, < 10,0 < X3< a. 


X., = A,. 10+X, (0 <A, < 10,0 < <X, < 10), 
X, = Á 0 < A, < 10). 
Thus X may be expressed uniquely in the form 
(9.1.2) X = A, 1084+ A,. 10%-!+-...4-A,. 1044; 


where every A is one of 0, 1, 2 ,..., 9, and A, is not 0. We abbreviate 
this expression to 


(9.1.3) X=A, Ay,...A, Ags, 


the ordinary representation of X in decimal notation. 
Passing to x, we write 


He Kh (0 <fi < 1). 
We suppose that a, = [10f,], so that 


a, is one of 0, 1,..., 9, and 
a = [10f,], 10f, = atf (0 < fe <1). 


t Thus [£] has the same meaning agin § 6.11. 
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Similarly, we define a, d,,.,. by 
= [10f,], 10f = agt+f, (0 <f <1), 
= [10], 10f; - Astfa (0 < fg <1), 


Every a, is one of 0, 1, 2,..,, 9. Thus 


(9.1.4) w= Lat In+y 
where 
ay 
(9.1.5) Xn = 76 34 pet +28, 
P 1 
(9.1.6) O S Iny = Tor © Jor’ 


We thus define a decimal -a, Qg 43...0, 


associated with x. We call a,, a,,... the first, second,... digits of the 
decimal. 
Since $a_n < 10$, the series 

(9.1.7) $\sum_{n=1}^{\infty} \frac{a_n}{10^n}$ 

is convergent; and since $g_{n+1} \to 0$, its sum is x. We may therefore write 

(9.1.8) $x = .a_1 a_2 a_3 \ldots,$ 

the right-hand side being an abbreviation for the series (9.1.7). 
If $f_{n+1} = 0$ for some n, i.e. if $10^n x$ is an integer, then 

$a_{n+1} = a_{n+2} = \ldots = 0.$ 

In this case we say that the decimal terminates. Thus 

$\frac{17}{400} = .0425000\ldots,$ 

and we write simply $\frac{17}{400} = .0425$. 


It is plain that the decimal for x will terminate if and only if x is a 
rational fraction whose denominator is of the form $2^\alpha 5^\beta$. 

Since 

$\frac{a_{n+1}}{10^{n+1}} + \frac{a_{n+2}}{10^{n+2}} + \ldots = g_{n+1} < \frac{1}{10^n}$ 

and 

$\frac{9}{10^{n+1}} + \frac{9}{10^{n+2}} + \ldots = \frac{9}{10^{n+1}\left(1 - \frac{1}{10}\right)} = \frac{1}{10^n},$ 

it is impossible that every $a_n$ from a certain point on should be 9. With 
this reservation, every possible sequence $(a_n)$ will arise from some x. 
We define x as the sum of the series (9.1.7), and $x_n$ and $g_{n+1}$ as in (9.1.4) 
and (9.1.5). Then $g_{n+1} < 10^{-n}$ for every n, and x yields the sequence 
required. 

Finally, if 

(9.1.9) $\sum_{n=1}^{\infty} \frac{a_n}{10^n} = \sum_{n=1}^{\infty} \frac{b_n}{10^n},$ 

and the $b_n$ satisfy the conditions already imposed on the $a_n$, then 
$a_n = b_n$ for every n. For if not, let $a_N$ and $b_N$ be the first pair which 
differ, so that $|a_N - b_N| \ge 1$. Then 

$\left| \sum_{n=1}^{\infty} \frac{a_n}{10^n} - \sum_{n=1}^{\infty} \frac{b_n}{10^n} \right| \ge \frac{1}{10^N} - \sum_{n=N+1}^{\infty} \frac{|a_n - b_n|}{10^n} \ge \frac{1}{10^N} - \sum_{n=N+1}^{\infty} \frac{9}{10^n} = 0.$ 

This contradicts (9.1.9) unless there is equality. If there is equality, 
then all of $a_{N+1} - b_{N+1}$, $a_{N+2} - b_{N+2}$,... must have the same sign and the 
absolute value 9. But then either $a_n = 9$ and $b_n = 0$ for n > N, or 
else $a_n = 0$ and $b_n = 9$, and we have seen that each of these alternatives 
is impossible. Hence $a_n = b_n$ for all n. In other words, different 
decimals correspond to different numbers. 
We now combine (9.1.1), (9.1.3), and (9.1.8) in the form 

(9.1.10) $\xi = X + x = A_1 A_2 \ldots A_{s+1} . a_1 a_2 a_3 \ldots;$ 

and we can sum up our conclusions as follows. 


THEOREM 134. Any positive number $\xi$ may be expressed as a decimal 

$A_1 A_2 \ldots A_{s+1} . a_1 a_2 a_3 \ldots,$ 

where $0 \le A_1 < 10$, $0 \le A_2 < 10$,..., $0 \le a_n < 10$, 
not all A and a are 0, and an infinity of the $a_n$ are less than 9. If $\xi \ge 1$, 
then $A_1 > 0$. There is a (1, 1) correspondence between the numbers and 
the decimals, and 

$\xi = A_1 \cdot 10^s + \ldots + A_{s+1} + \frac{a_1}{10} + \frac{a_2}{10^2} + \ldots.$ 


In what follows we shall usually suppose that $0 < \xi < 1$, so that 
X = 0, $\xi = x$. In this case all the A are 0. We shall sometimes save 
words by ignoring the distinction between the number x and the decimal 
which represents it, saying, for example, that the second digit of $\frac{17}{400}$ is 4. 


9.2. Terminating and recurring decimals. A decimal which 
does not terminate may recur. Thus 

$\frac{1}{3} = .3333\ldots, \quad \frac{1}{7} = .14285714285714\ldots;$ 

equations which we express more shortly as 

$\frac{1}{3} = .\dot{3}, \quad \frac{1}{7} = .\dot{1}4285\dot{7}.$ 

These are pure recurring decimals in which the period reaches back to 
the beginning. On the other hand, 

$\frac{1}{6} = .1666\ldots = .1\dot{6},$ 

a mixed recurring decimal in which the period is preceded by one 
non-recurrent digit. 
We now determine the conditions for termination or recurrence. 


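(1) If 

$x = \frac{p}{q} = \frac{p}{2^\alpha 5^\beta},$ 

where (p, q) = 1, and 

(9.2.1) $\mu = \max(\alpha, \beta),$ 

then $10^n x$ is an integer for $n = \mu$ and for no smaller value of n, so that 
x terminates at $a_\mu$. Conversely, 

$\frac{a_1}{10} + \frac{a_2}{10^2} + \ldots + \frac{a_\mu}{10^\mu} = \frac{P}{10^\mu} = \frac{p}{q},$ 

where q has the prime factors 2 and 5 only. 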

(2) Suppose next that x = p/q, (p, q) = 1, and (q, 10) = 1, so that 
q is not divisible by 2 or 5. Our discussion of this case depends upon 
the theorems of Ch. VI. 

By Theorem 88, 

$10^\nu \equiv 1 \pmod{q}$ 

for some $\nu$, the least such $\nu$ being a divisor of $\phi(q)$. We suppose that $\nu$ 
has this smallest possible value, i.e. that, in the language of § 6.8, 10 
belongs to $\nu$ (mod q) or $\nu$ is the order of 10 (mod q). Then 

(9.2.2) $10^\nu x = \frac{10^\nu p}{q} = \frac{(mq+1)p}{q} = mp + \frac{p}{q} = mp + x,$ 

where m is an integer. But 

$10^\nu x = 10^\nu x_\nu + 10^\nu g_{\nu+1} = 10^\nu x_\nu + f_{\nu+1},$ 

by (9.1.4). Since 0 < x < 1, $f_{\nu+1} = x$, and the process by which the 
decimal was constructed repeats itself from $f_{\nu+1}$ onwards. Thus x is a 
pure recurring decimal with a period of at most $\nu$ figures. 

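On the other hand, a pure recurring decimal $.\dot{a}_1 a_2 \ldots \dot{a}_\lambda$ is equal to 

$\left( \frac{a_1}{10} + \frac{a_2}{10^2} + \ldots + \frac{a_\lambda}{10^\lambda} \right) \left( 1 + \frac{1}{10^\lambda} + \frac{1}{10^{2\lambda}} + \ldots \right) = \frac{10^{\lambda-1} a_1 + 10^{\lambda-2} a_2 + \ldots + a_\lambda}{10^\lambda - 1} = \frac{p}{q},$ 

when reduced to its lowest terms. Here $q \mid 10^\lambda - 1$, and so $\lambda \ge \nu$. It 
follows that if (q, 10) = 1, and the order of 10 (mod q) is $\nu$, then x is a 
pure recurring decimal with a period of just $\nu$ digits; and conversely. 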
(3) Finally, suppose that 

(9.2.3) $x = \frac{p}{q} = \frac{p}{2^\alpha 5^\beta Q},$ 

where (p, q) = 1 and (Q, 10) = 1; that $\mu$ is defined as in (9.2.1); and 
that $\nu$ is the order of 10 (mod Q). Then 

$10^\mu x = \frac{p'}{Q} = X + \frac{P}{Q},$ 

where p', X, P are integers and 

$0 \le X < 10^\mu, \quad 0 < P < Q, \quad (P, Q) = 1.$ 

If X > 0 then $10^s \le X < 10^{s+1}$ for some $s < \mu$, and $X = A_1 A_2 \ldots A_{s+1}$; 
and the decimal for P/Q is pure recurring and has a period of $\nu$ digits. 
Hence 

$10^\mu x = A_1 A_2 \ldots A_{s+1} . \dot{a}_1 a_2 \ldots \dot{a}_\nu,$ 

and 

(9.2.4) $x = .b_1 b_2 \ldots b_\mu \dot{a}_1 a_2 \ldots \dot{a}_\nu,$ 

the last s+1 of the b being $A_1, A_2, \ldots, A_{s+1}$, and the rest, if any, 0. 
Conversely, it is plain that any decimal (9.2.4) represents a fraction 
(9.2.3). We have thus proved 
THEOREM 135. The decimal for a rational number p/q between 0 and 1 
is terminating or recurring, and any terminating or recurring decimal is 
equal to a rational number. If (p, q) = 1, $q = 2^\alpha 5^\beta$, and $\max(\alpha, \beta) = \mu$, 
then the decimal terminates after $\mu$ digits. If (p, q) = 1, $q = 2^\alpha 5^\beta Q$, 
where Q > 1, (Q, 10) = 1, and $\nu$ is the order of 10 (mod Q), then the 
decimal contains $\mu$ non-recurring and $\nu$ recurring digits. 
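
The two numbers in Theorem 135 are easily computed. The sketch below is only an illustration (the name `decimal_shape` is ours): it strips the factors 2 and 5 from the reduced denominator to obtain the number of non-recurring digits, and then finds the order of 10 modulo the remaining factor Q by repeated multiplication. A terminating decimal is reported with period 0.

```python
from math import gcd

def decimal_shape(p, q):
    """Return (mu, nu): the numbers of non-recurring and recurring digits of p/q, 0 < p/q < 1."""
    q //= gcd(p, q)               # reduce so that (p, q) = 1
    alpha = beta = 0
    while q % 2 == 0:
        q //= 2
        alpha += 1
    while q % 5 == 0:
        q //= 5
        beta += 1
    mu = max(alpha, beta)
    if q == 1:
        return mu, 0              # denominator 2^alpha 5^beta: the decimal terminates
    nu, power = 1, 10 % q         # find the order of 10 (mod Q)
    while power != 1:
        power = (power * 10) % q
        nu += 1
    return mu, nu

print(decimal_shape(1, 7))        # (0, 6)
print(decimal_shape(1, 6))        # (1, 1)
print(decimal_shape(17, 400))     # (4, 0)
```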


9.3. Representation of numbers in other scales. There is no 
reason except familiarity for our special choice of the number 10; we 
may replace 10 by 2 or by any greater number r. Thus 


the first two decimals being ‘binary’ decimals or ‘decimals in the scale 
of 2’, the third a ‘decimal in the scale of 7’.† Generally, we speak of 
‘decimals in the scale of r’. 

The arguments of the preceding sections may be repeated with certain 
changes, which are obvious if r is a prime or a product of different 
primes (like 2 or 10), but require a little more consideration if r has 
square divisors (like 12 or 8). We confine ourselves for simplicity to the 
first case, when our arguments require only trivial alterations. In § 9.1, 
10 must be replaced by r and 9 by r-1. In § 9.2, the part of 2 and 5 is 
played by the prime divisors of r. 


THEOREM 136. Suppose that r is a prime or a product of different 
primes. Then any positive number $\xi$ may be represented uniquely as a 
decimal in the scale of r. An infinity of the digits of the decimal are less 
than r-1; with this reservation, the correspondence between the numbers 
and the decimals is (1, 1). 

Suppose further that 

$0 < x < 1, \quad x = \frac{p}{q}, \quad (p, q) = 1.$ 

If 

$q = s^\alpha t^\beta \ldots u^\gamma,$ 

where s, t,..., u are the prime factors of r, and 

$\mu = \max(\alpha, \beta, \ldots, \gamma),$ 

then the decimal for x terminates at the $\mu$th digit. If q is prime to r, and 
$\nu$ is the order of r (mod q), then the decimal is pure recurring and has a 
period of $\nu$ digits. If 

$q = s^\alpha t^\beta \ldots u^\gamma Q \quad (Q > 1),$ 

Q is prime to r, and $\nu$ is the order of r (mod Q), then the decimal is mixed 
recurring, and has $\mu$ non-recurring and $\nu$ recurring digits.‡ 
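
The expansion in the scale of r can be computed in the same way as in the scale of 10, and the recurrence detected by watching for a repeated remainder. The following sketch is ours (the name `expand_in_scale` and the choice of examples are assumptions made for illustration); it works for any integer r greater than 1, whether or not r has square divisors.

```python
from fractions import Fraction
from math import floor

def expand_in_scale(x, r):
    """Split the scale-r digits of a rational x (0 <= x < 1) into (non_recurring, recurring)."""
    f = Fraction(x)
    seen = {}                      # remainder -> position of the digit it produces
    digits = []
    while f != 0 and f not in seen:
        seen[f] = len(digits)
        a = floor(r * f)           # next digit is the integral part of r * remainder
        f = r * f - a
        digits.append(a)
    if f == 0:
        return digits, []          # the decimal terminates
    start = seen[f]                # the period begins where this remainder first occurred
    return digits[:start], digits[start:]

print(expand_in_scale(Fraction(1, 8), 2))   # ([0, 0, 1], [])
print(expand_in_scale(Fraction(2, 3), 2))   # ([], [1, 0])
print(expand_in_scale(Fraction(2, 3), 7))   # ([], [4])
```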


9.4. Irrationals defined by decimals. It follows from Theorem 
136 that a decimal (in any scale||) which neither terminates nor recurs 
must represent an irrational number. Thus 

$x = .0100100010\ldots$ 

(the number of 0’s increasing by 1 at each stage) is irrational. We 
consider some less obvious examples. 

† We ignore the verbal contradiction involved in the use of ‘decimal’; there is no 
other convenient word. 

‡ Generally, when $r = s^A t^B \ldots u^C$, we must define $\mu$ as 
$\max\left( \frac{\alpha}{A}, \frac{\beta}{B}, \ldots, \frac{\gamma}{C} \right)$ 
if this number is an integer, and otherwise as the first greater integer. 

|| Strictly, any ‘quadratfrei’ scale (scale whose base is a prime or a product of different 
primes). This is the only case actually covered by the theorems, but there is no difficulty 
in the extension. 


THEOREM 137: 
.011010100010..., 
where the digit $a_n$ is 1 if n is prime and 0 otherwise, is irrational. 


Theorem 4 shows that the decimal does not terminate. If it recurs, 
there is a function An+B which is prime for all n from some point 
onwards; and Theorem 21 shows that this also is impossible. 

This theorem is true in any scale. We state our next theorem for 


the scale of 10, leaving the modifications required for other scales to the 
reader. 


THEOREM 138: 
.2357111317192329..., 


where the sequence of digits is formed by the primes in ascending order, is 
irrational. 


The proof of Theorem 138 is a little more difficult. We give two 
alternative proofs. 


(1) Let us assume that any arithmetical progression of the form 

$k \cdot 10^{s+1} + 1 \quad (k = 1, 2, 3, \ldots)$ 

contains primes. Then there are primes whose expressions in the decimal 
system contain an arbitrary number s of 0’s, followed by a 1. Since 
the decimal contains such sequences, it does not terminate or recur. 


(2) Let us assume that there is a prime between N and 10N for every 
$N \ge 1$. Then, given s, there are primes with just s digits. If the decimal 
recurs, it is of the form 

(9.4.1) $\ldots a_1 a_2 \ldots a_k \,|\, a_1 a_2 \ldots a_k \,|\, \ldots,$ 

the bars indicating the period, and the first being placed where the 
first period begins. We can choose l > 1 so that all primes with s = kl 
digits stand later in the decimal than the first bar. If p is the first such 
prime, then it must be of one of the forms 

$p = a_1 a_2 \ldots a_k \,|\, a_1 a_2 \ldots a_k \,|\, \ldots \,|\, a_1 a_2 \ldots a_k$ 

or 

$p = a_{m+1} \ldots a_k \,|\, a_1 a_2 \ldots a_k \,|\, \ldots \,|\, a_1 a_2 \ldots a_k \,|\, a_1 a_2 \ldots a_m,$ 

and is divisible by $a_1 a_2 \ldots a_k$ or by $a_{m+1} \ldots a_k a_1 a_2 \ldots a_m$; a contradiction. 

In our first proof we assumed a special case of Dirichlet’s Theorem 15. 
This special case is easier to prove than the general theorem, but we 
shall not prove it in this book, so that (1) will remain incomplete. In 
(2) we assumed a result which follows at once from Theorem 418 (which 
we shall prove in Chapter XXII). The latter theorem asserts that, for 
every $N \ge 1$, there is at least one prime satisfying $N < p \le 2N$. It 
follows, a fortiori, that N < p < 10N. 


9.5. Tests for divisibility. In this and the next few sections we 
shall be concerned for the most part with trivial but amusing puzzles. 

There are not very many useful tests for the divisibility of an integer 
by particular integers such as 2, 3, 5,.... A number is divisible by 2 if 
its last digit is even. More generally, it is divisible by $2^\nu$ if and only if 
the number represented by its last $\nu$ digits is divisible by $2^\nu$. The reason, 
of course, is that $2^\nu \mid 10^\nu$; and there are similar rules for 5 and $5^\nu$. 

Next 

$10^\nu \equiv 1 \pmod 9$ 

for every $\nu$, and therefore 

$A_1 \cdot 10^s + A_2 \cdot 10^{s-1} + \ldots + A_s \cdot 10 + A_{s+1} \equiv A_1 + A_2 + \ldots + A_{s+1} \pmod 9.$ 

A fortiori this is true mod 3. Hence we obtain the well-known rule 
‘a number is divisible by 9 (or by 3) if and only if the sum of its digits 
is divisible by 9 (or by 3)’. 

There is a rather similar rule for 11. Since $10 \equiv -1 \pmod{11}$, we 
have 

$10^{2r} \equiv 1, \quad 10^{2r+1} \equiv -1 \pmod{11},$ 

so that 

$A_1 \cdot 10^s + A_2 \cdot 10^{s-1} + \ldots + A_s \cdot 10 + A_{s+1} \equiv A_{s+1} - A_s + A_{s-1} - \ldots \pmod{11}.$ 

A number is divisible by 11 if and only if the difference between the 
sums of its digits of odd and even ranks is divisible by 11. 

We know of only one other rule of any practical use. This is a test 
for divisibility by any one of 7, 11, or 13, and depends on the fact that 
7.11.13 = 1001. Its working is best illustrated by an example: if 
29310478561 is divisible by 7, 11 or 13, so is 

561 - 478 + 310 - 29 = 364 = 4.7.13. 

Hence the original number is divisible by 7 and by 13 but not by 11. 
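
The last test may be sketched as follows. The function name `alternating_block_sum` is ours; it forms the alternating sum of the blocks of three digits, beginning from the right, which is congruent to the original number (mod 1001) because 1000 is congruent to -1.

```python
def alternating_block_sum(n):
    """Alternating sum of the three-digit blocks of n, starting from the right-hand block."""
    total, sign = 0, 1
    while n > 0:
        n, block = divmod(n, 1000)   # peel off the last three digits
        total += sign * block
        sign = -sign
    return total

n = 29310478561
t = alternating_block_sum(n)
print(t)                                         # 364
print([d for d in (7, 11, 13) if t % d == 0])    # [7, 13]
```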


9.6. Decimals with the maximum period. We observe when 
learning elementary arithmetic that 


$\frac{1}{7} = .\dot{1}4285\dot{7}, \quad \frac{2}{7} = .\dot{2}8571\dot{4}, \quad \ldots, \quad \frac{6}{7} = .\dot{8}5714\dot{2},$ 

the digits in each of the periods differing only by a cyclic permutation. 
Consider, more generally, the decimal for the reciprocal of a prime q. 
The number of digits in the period is the order of 10 (mod q), and is a 
divisor of $\phi(q) = q-1$. If this order is q-1, i.e. if 10 is a primitive root 
of q, then the period has q-1 digits, the maximum number possible. 

We convert 1/q into a decimal by dividing successive powers of 10 
by q; thus 

$\frac{10^n}{q} = 10^n x_n + f_{n+1},$ 

in the notation of § 9.1. The later stages of the process depend only 
upon the value of $f_{n+1}$, and the process recurs so soon as $f_{n+1}$ repeats a 
value. If, as here, the period contains q-1 digits, then the remainders 

$f_2, f_3, \ldots, f_q$ 

must all be different, and must be a permutation of the fractions 

$\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}.$ 

The last remainder $f_q$ is 1/q. 
The corresponding remainders when we convert p/q into a decimal are 

$p f_2, p f_3, \ldots, p f_q,$ 

reduced (mod 1). These are, by Theorem 58, the same numbers in a 
different order, and the sequence of digits, after the occurrence of a 
particular remainder s/q, is the same as it was after the occurrence of 
s/q before. Hence the two decimals differ only by a cyclic permutation 
of the period. 

What happens with 7 will happen with any q of which 10 is a primitive 
root. Very little is known about these q, but the q below 50 which 
satisfy the condition are 

7, 17, 19, 23, 29, 47. 


THEOREM 139. If q is a prime, and 10 is a primitive root of q, then the 
decimals for 

$\frac{p}{q} \quad (p = 1, 2, \ldots, q-1)$ 

have periods of length q-1 and differing only by cyclic permutation. 
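
The list of primes and the cyclic property can be checked by direct computation. This is a small sketch under our own naming (`order_of_10`, `is_prime`, `period`); the order of 10 is found by repeated multiplication, and the period by long division.

```python
def order_of_10(q):
    """Order of 10 modulo q, assuming q > 1 and (q, 10) = 1."""
    nu, power = 1, 10 % q
    while power != 1:
        power = (power * 10) % q
        nu += 1
    return nu

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def period(p, q):
    """The digits of one period of p/q, as a string, for 0 < p < q and (q, 10) = 1."""
    digits, rem = [], p
    for _ in range(order_of_10(q)):
        digit, rem = divmod(10 * rem, q)   # one step of long division
        digits.append(str(digit))
    return "".join(digits)

full = [q for q in range(2, 50)
        if is_prime(q) and q not in (2, 5) and order_of_10(q) == q - 1]
print(full)                                    # [7, 17, 19, 23, 29, 47]
print([period(p, 7) for p in range(1, 7)])
```

A string is a cyclic permutation of `period(1, q)` exactly when it has the same length and occurs inside that period written twice, which gives a convenient test.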


9.7. Bachet’s problem of the weights. What is the least number 
of weights which will weigh any integral number of pounds up to 40 
(a) when weights may be put into one pan only and (b) when weights 
may be put into either pan? 
The second problem is the more interesting. We can dispose of the 
first by proving 


THEOREM 140. Weights 1, 2, 4,..., $2^{n-1}$ will weigh any integral weight 
up to $2^n - 1$; and no other set of so few as n weights is equally effective (i.e. 
will weigh so long an unbroken sequence of weights from 1). 


Any positive integer up to $2^n - 1$ inclusive can be expressed uniquely 
as a binary decimal of n figures, i.e. as a sum 

$\sum_{s=0}^{n-1} a_s 2^s,$ 

where every $a_s$ is 0 or 1. Hence our weights will do what is wanted, 
and ‘without waste’ (no two arrangements of them producing the same 
result). Since there is no waste, no other selection of weights can weigh 
a longer sequence. 

Finally, one weight must be 1 (to weigh 1); one must be 2 (to weigh 
2); one must be 4 (to weigh 4); and so on. Hence 1, 2, 4,..., $2^{n-1}$ is the 
only system of weights which will do what is wanted. 

It is to be observed that Bachet’s number 40, not being of the form 
$2^n - 1$, is not chosen appropriately for this problem. The weights 1, 2, 
4, 8, 16, 32 will weigh up to 63, and no combination of 5 weights will 
weigh beyond 31. But the solution for 40 is not unique; the weights 
1, 2, 4, 8, 9, 16 will also weigh any weight up to 40. 
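
These statements about particular sets of weights may be verified by enumeration. The sketch below uses names of our own (`weighable`, `unbroken_reach`): it lists every total obtainable by putting some of the weights into one pan, and then finds how far the unbroken sequence 1, 2, 3,... extends.

```python
from itertools import combinations

def weighable(weights):
    """All totals obtainable by placing a non-empty selection of the weights in one pan."""
    return {sum(c) for k in range(1, len(weights) + 1)
            for c in combinations(weights, k)}

def unbroken_reach(weights):
    """The largest N such that every integer from 1 to N can be weighed."""
    totals, n = weighable(weights), 0
    while n + 1 in totals:
        n += 1
    return n

print(unbroken_reach([1, 2, 4, 8, 16, 32]))   # 63
print(unbroken_reach([1, 2, 4, 8, 9, 16]))    # 40
print(unbroken_reach([1, 2, 4, 8, 16]))       # 31
```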

Passing to the second problem, we prove 


THEOREM 141. Weights 1, 3, $3^2$,..., $3^{n-1}$ will weigh any weight up to 
$\frac{1}{2}(3^n - 1)$, when weights may be placed in either pan; and no other set of so 
few as n weights is equally effective. 


(1) Any positive integer up to $3^n - 1$ inclusive can be expressed 
uniquely by n digits in the ternary scale, i.e. as a sum 

$\sum_{s=0}^{n-1} a_s 3^s,$ 

where every $a_s$ is 0, 1, or 2. Subtracting 

$1 + 3 + 3^2 + \ldots + 3^{n-1} = \frac{1}{2}(3^n - 1),$ 

we see that every positive or negative integer between $-\frac{1}{2}(3^n - 1)$ and 
$\frac{1}{2}(3^n - 1)$ inclusive can be expressed uniquely in the form 

$\sum_{s=0}^{n-1} b_s 3^s,$ 

where every $b_s$ is -1, 0, or 1. Hence our weights, placed in either pan, 
will weigh any weight between these limits.† Since there is no waste, 
no other combination of n weights can weigh a longer sequence. 

(2) The proof that no other combination will weigh so long a sequence 
is a little more troublesome. It is plain, since there must be no waste, 
that the weights must all differ. We suppose that they are 

$w_1 < w_2 < \ldots < w_n.$ 

The two largest weighable weights are plainly 

$W = w_1 + w_2 + \ldots + w_n, \quad W_1 = w_2 + \ldots + w_n.$ 

Since $W_1 = W - 1$, $w_1$ must be 1. 
The next weighable weight is 

$-w_1 + w_2 + w_3 + \ldots + w_n = W - 2,$ 

and the next must be 

$w_1 + w_3 + w_4 + \ldots + w_n.$ 

Hence $w_1 + w_3 + \ldots + w_n = W - 3$ and $w_2 = 3$. 
Suppose now that we have proved that 

$w_1 = 1, \quad w_2 = 3, \quad \ldots, \quad w_s = 3^{s-1}.$ 

If we can prove that $w_{s+1} = 3^s$, the conclusion will follow by induction. 
The largest weighable weight W is 

$W = \sum_{t=1}^{s} w_t + \sum_{t=s+1}^{n} w_t.$ 

Leaving the weights $w_{s+1}, \ldots, w_n$ undisturbed, and removing some of 
the other weights, or transferring them to the other pan, we can weigh 
every weight down to 

$-\sum_{t=1}^{s} w_t + \sum_{t=s+1}^{n} w_t = W - (3^s - 1),$ 

but none below. The next weight less than this is $W - 3^s$, and this 
must be 

$w_1 + w_2 + \ldots + w_s + w_{s+2} + w_{s+3} + \ldots + w_n.$ 

Hence 

$w_{s+1} = 2(w_1 + w_2 + \ldots + w_s) + 1 = 3^s,$ 

the conclusion required. 

Bachet’s problem corresponds to the case n = 4. 
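
The representation with digits -1, 0, 1 used in part (1) can be found digit by digit. The following sketch is an illustration only (the name `two_pan_placement` is ours): a digit 1 may be read as a weight placed in the pan opposite to the object, and a digit -1 as a weight placed in the same pan as the object.

```python
def two_pan_placement(m, n):
    """Digits b_0..b_{n-1}, each -1, 0, or 1, with sum(b_s * 3**s) == m."""
    limit = (3 ** n - 1) // 2
    if abs(m) > limit:
        raise ValueError("weight out of range for n weights")
    digits = []
    for _ in range(n):
        r = m % 3              # 0, 1, or 2 (Python's % is non-negative here)
        if r == 2:
            r = -1             # write 2 as 3 - 1: digit -1, carry 1 to the next place
        digits.append(r)
        m = (m - r) // 3
    return digits

print(two_pan_placement(40, 4))   # [1, 1, 1, 1]
print(two_pan_placement(5, 4))    # [-1, -1, 1, 0]
```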


9.8. The game of Nim. The game of Nim is played as follows. 
Any number of matches are arranged in heaps, the number of heaps, 
and the number of matches in each heap, being arbitrary. There are 
two players, A and B. The first player A takes any number of matches 
from a heap; he may take one only, or any number up to the whole 
of the heap, but he must touch one heap only. B then makes a move 
conditioned similarly, and the players continue to take alternately. The 
player who takes the last match wins the game. 

† Counting the weight to be weighed positive if it is placed in one pan and negative 
if it is placed in the other. 

The game has a precise mathematical theory, and one or other player 
can always force a win. 

We define a winning position as a position such that if one player 
P (A or B) can secure it by his move, leaving his opponent Q (B or A) 
to move next, then, whatever Q may do, P can play so as to win the 
game. Any other position we call a losing position. 

For example, the position 

    | |    | | 

or (2, 2), is a winning position. If A leaves this position to B, B must 
take one match from a heap or two. If B takes two, A takes the 
remaining two. If B takes one, A takes one from the other heap; and 
in either case A wins. Similarly, as the reader will easily verify, 

    |    | |    | | | 

or (1, 2, 3), is a winning position. 

We next define a correct position. We express the number of matches 
in each heap in the binary scale, and form a figure F by writing them 
down one under the other. Thus (2, 2), (1, 2, 3), and (2, 3, 6, 7) give the 
figures 

    10      01      010 
    10      10      011 
    --      11      110 
    20      --      111 
            22      --- 
                    242 

it is convenient to write 01, 010,... for 1, 10,... so as to equalize the 
number of figures in each row. We then add up the columns, as 
indicated in the figures. If the sum of each column is even (as in the cases 
shown) then the position is ‘correct’. An incorrect position is one which 
is not correct: thus (1, 3, 4) is incorrect. 

THEOREM 142. A position in Nim is a winning position if and only if 
it is correct. 

(1) Consider first the special case in which no heap contains more 
than one match. It is plain that the position is winning if the number 
of matches left is even, and losing if it is odd; and that the same 
conditions define correct and incorrect positions. 

(2) Suppose that P has to take from a correct position. He must 
replace one number defining a row of F by a smaller number. If we 
replace any number, expressed in the binary scale, by a smaller number, 
we change the parity of at least one of its digits. Hence when P takes 
from a correct position, he necessarily transforms it into an incorrect 
position. 


(3) If a position is incorrect, then the sum of at least one column of 
F is odd. Suppose, to fix our ideas, that the sums of the columns are 

even, even, odd, even, odd, even. 

Then there is at least one 1 in the third column (the first with an odd 
sum). Suppose (again to fix our ideas) that one row in which this 
happens is 

      * * 
    011101, 

the asterisks indicating that the numbers below them are in columns 
whose sum is odd. We can replace this number by the smaller number 

      * * 
    010111, 

in which the digits with an asterisk, and those only, are altered. Plainly 
this change corresponds to a possible move, and makes the sum of every 
column even; and the argument is general. Hence P, if presented with 
an incorrect position, can always convert it into a correct position. 


(4) If A leaves a correct position, B is compelled to convert it into 
an incorrect position, and A can then move so as to restore a correct 
position. This process will continue until every heap is exhausted or 
contains one match only. The theorem is thus reduced to the special 
case already proved. 


The issue of the game is now clear. In general, the original position 
will be incorrect, and the first player wins if he plays properly. But 
he loses if the original position happens to be correct and the second 
player plays properly.† 


† When playing against an opponent who does not know the theory of the game, 
there is no need to play strictly according to rule. The experienced player can play at 
random until he recognizes a winning position of a comparatively simple type. It is 
quite enough to know that 

    1, 2n, 2n+1;    n, 7-n, 7;    2, 3, 4, 5 

are winning positions; that 

    1, 2n+1, 2n+2 

is a losing position; and that a combination of two winning positions is a winning position. 
The winning move is not always unique. The position 

    1, 3, 9, 27 

is incorrect, and the only move which makes it correct is to take 16 from the 27. The 
position 

    3, 5, 7, 8, 11 

is also incorrect, but may be made correct by taking 2 from the 3, the 7, or the 11. 

There is a variation in which the player who takes the last match 
loses. The theory is the same so long as a heap remains containing more 
than one match; thus (2, 2) and (1, 2, 3) are still winning positions. We 
leave it to the reader to think out for himself the small variations in 
tactics at the end of the game. 
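
The rule for the ordinary game lends itself to a short sketch. A column of the figure F has an even sum exactly when the corresponding binary digit of the ‘exclusive or’ of the heaps is 0, so that a position is correct when this combination of all the heaps is zero. The function names below are ours, and the code is an illustration of the rule rather than a complete program for playing the game.

```python
from functools import reduce
from operator import xor

def is_correct(heaps):
    """True when every binary column sum is even, i.e. the XOR of all heaps is 0."""
    return reduce(xor, heaps, 0) == 0

def correcting_moves(heaps):
    """All moves (heap_index, matches_to_take) which make an incorrect position correct."""
    s = reduce(xor, heaps, 0)      # the 1-digits of s mark the columns with odd sum
    moves = []
    for i, h in enumerate(heaps):
        target = h ^ s             # alter exactly the digits lying in odd columns
        if target < h:             # legal only if the heap becomes smaller
            moves.append((i, h - target))
    return moves

print(is_correct([2, 3, 6, 7]))             # True
print(correcting_moves([1, 3, 9, 27]))      # [(3, 16)]
print(correcting_moves([3, 5, 7, 8, 11]))   # [(0, 2), (2, 2), (4, 2)]
```

For a correct position the list of moves is empty, in agreement with part (2) of the proof.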


9.9. Integers with missing digits. There is a familiar paradox† 
concerning integers from whose expression in the decimal scale some 
particular digit such as 9 is missing. It might seem at first as if this 
restriction should only exclude ‘about one-tenth’ of the integers, but 
this is far from the truth. 


THEOREM 143. Almost all numbers‡ contain a 9, or any given sequence 
of digits such as 937. More generally, almost all numbers, when expressed 
in any scale, contain every possible digit, or possible sequence of digits. 

Suppose that the scale is r, and that $\nu$ is a number whose decimal 
misses the digit b. The number of $\nu$ for which $r^{l-1} \le \nu < r^l$ is $(r-1)^l$ if 
b = 0 and $(r-2)(r-1)^{l-1}$ if $b \ne 0$, and in any case does not exceed 
$(r-1)^l$. Hence, if 

$r^{k-1} \le n < r^k,$ 

the number N(n) of $\nu$ up to n does not exceed 

$r-1 + (r-1)^2 + \ldots + (r-1)^k \le k(r-1)^k;$ 

and 

$\frac{N(n)}{n} \le \frac{k(r-1)^k}{r^{k-1}} = kr\left(\frac{r-1}{r}\right)^k,$ 

which tends to 0 when $n \to \infty$. 

The statements about sequences of digits need no additional proof, 
since, for example, the sequence 937 in the scale of 10 may be regarded 
as a single digit in the scale of 1000. 
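
The decline of the proportion N(n)/n may be observed numerically for small n. The sketch below counts by brute force and is meant only as an illustration; the names `misses_digit` and `density` are ours, and the values of n are taken, for convenience, to be one less than a power of the scale.

```python
def misses_digit(v, b, r=10):
    """True if the expression of the positive integer v in the scale of r lacks the digit b."""
    while v > 0:
        v, d = divmod(v, r)
        if d == b:
            return False
    return True

def density(k, b=9, r=10):
    """Return (N(n), n) for n = r**k - 1: how many of 1..n miss the digit b."""
    n = r ** k - 1
    count = sum(misses_digit(v, b, r) for v in range(1, n + 1))
    return count, n

for k in range(1, 6):
    count, n = density(k)
    print(k, count, n, round(count / n, 4))
```

For the digit 9 in the scale of 10 the count is one less than a power of 9, since each of the k places may hold any of nine digits and the number 0 is excluded.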


The ‘paradox’ is usually stated in a slightly stronger form, viz. 

THEOREM 144. The sum of the reciprocals of the numbers which miss a given 
digit is convergent. 

The number of $\nu$ between $r^{k-1}$ and $r^k$ is at most $(r-1)^k$. Hence 

$\sum \frac{1}{\nu} = \sum_{k=1}^{\infty} \sum_{r^{k-1} \le \nu < r^k} \frac{1}{\nu} < \sum_{k=1}^{\infty} \frac{(r-1)^k}{r^{k-1}} = (r-1) \sum_{k=1}^{\infty} \left(\frac{r-1}{r}\right)^{k-1} = r(r-1).$ 


We shall discuss next some analogous, but more interesting, properties 
of infinite decimals. We require a few elementary notions concerning 
the measure of point-sets or sets of real numbers. 

† Relevant in controversies about telephone directories. 
‡ In the sense of § 1.6. 


9.10. Sets of measure zero. A real number x defines a ‘point’ of 
the continuum. In what follows we use the words ‘number’ and ‘point’ 
indifferently, saying, for example, that ‘P is the point x’. 

An aggregate of real numbers is called a set of points. Thus the set 
T defined by 

$x = \frac{1}{n} \quad (n = 1, 2, 3, \ldots),$ 

the set R of all rationals between 0 and 1 inclusive, and the set C of 
all real numbers between 0 and 1 inclusive, are sets of points. 

An interval $(x-\delta, x+\delta)$, where $\delta$ is positive, is called a neighbourhood 
of x. If S is a set of points, and every neighbourhood of x includes an 
infinity of points of S, then x is called a limit point of S. The limit point 
may or may not belong to S, but there are points of S as near to it 
as we please. Thus T has one limit point, x = 0, which does not belong 
to T. Every x between 0 and 1 is a limit point of R. 

The set S' of limit points of S is called the derived set or derivative 
of S. Thus C is the derivative of R. If S includes S', i.e. if every limit 
point of S belongs to S, then S is said to be closed. Thus C is closed. 
If S' includes S, i.e. if every point of S is a limit point of S, then S is 
said to be dense in itself. If S and S' are identical (so that S is both 
closed and dense in itself), then S is said to be perfect. Thus C is perfect. 
A less trivial example will be found in § 9.11. 

A set S is said to be dense in an interval (a, b) if every point of (a, b) 
belongs to S'. Thus R is dense in (0, 1). 

If S can be included in a set J of intervals, finite or infinite in number, 
whose total length is as small as we please, then S is said to be of measure 
zero. Thus T is of measure zero. We include the point 1/n in the interval 

$\frac{1}{n} - 2^{-n-1}\delta, \quad \frac{1}{n} + 2^{-n-1}\delta$ 

of length $2^{-n}\delta$, and the sum of all these intervals (without allowance 
for possible overlapping) is 

$\delta \sum_{n=1}^{\infty} 2^{-n} = \delta,$ 

which we may suppose as small as we please. 
Generally, any enumerable set is of measure zero. A set is enumerable 
if its members can be correlated, as 

(9.10.1) $x_1, x_2, \ldots, x_n, \ldots,$ 

with the integers 1, 2,..., n,.... We include $x_n$ in an interval of length 
$2^{-n}\delta$, and the conclusion follows as in the special case of T. 

A subset of an enumerable set is finite or enumerable. The sum of 
an enumerable set of enumerable sets is enumerable. 

The rationals may be arranged as 

$\frac{0}{1}, \frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \frac{2}{3}, \frac{1}{4}, \frac{3}{4}, \frac{1}{5}, \frac{2}{5}, \frac{3}{5}, \ldots,$ 

and so in the form (9.10.1). Hence R is enumerable, and therefore of 
measure zero. A set of measure zero is sometimes called a null set; 
thus R is null. Null sets are negligible for many mathematical purposes, 
particularly in the theory of integration. 
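
The enumeration of R and the covering by short intervals can be illustrated together. In the sketch below the rationals of the interval are taken in order of increasing denominator, which is one possible arrangement and is assumed here for definiteness; the names `rationals_in_unit_interval` and `cover_length` are ours. The total length of the first N covering intervals always falls short of $\delta$.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def rationals_in_unit_interval():
    """Yield each rational of [0, 1] exactly once, in order of increasing denominator."""
    yield Fraction(0, 1)
    yield Fraction(1, 1)
    for q in count(2):
        for p in range(1, q):
            if gcd(p, q) == 1:          # skip fractions not in lowest terms
                yield Fraction(p, q)

def cover_length(delta, terms):
    """Total length of intervals of length 2**-n * delta about the first `terms` points."""
    return sum(delta / 2 ** n for n in range(1, terms + 1))

first = list(islice(rationals_in_unit_interval(), 9))
print([str(x) for x in first])
print(cover_length(Fraction(1, 100), 50) < Fraction(1, 100))   # True
```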

The sum S of an enumerable infinity of null sets $S_n$ (i.e. the set formed 
by all the points which belong to some $S_n$) is null. For we may include 
$S_n$ in a set of intervals of total length $2^{-n}\delta$, and so S in a set of intervals 
of total length not greater than $\delta \sum 2^{-n} = \delta$. 

Finally, we say that almost all points of an interval I possess a pro- 
perty if the set of points which do not possess the property is null. 
This sense of the phrase should be compared with the sense defined 
in § 1.6 and used in § 9.9. It implies in either case that ‘most’ of the 
numbers under consideration (the positive integers in §§ 1.6 and 9.9, the 
real numbers here) possess the property, and that other numbers are 
‘exceptional’.† 


9.11. Decimals with missing digits. The decimal 

$\frac{1}{7} = .\dot{1}4285\dot{7}$ 

has four missing digits, viz. 0, 3, 6, 9. But it is easy to prove that 
decimals which miss digits are exceptional. 

We define S as the set of points between 0 (inclusive) and 1 (exclusive) 
whose decimals, in the scale of r, miss the digit b. This set may be 
generated as follows. 

We divide (0, 1) into r equal parts 

$\frac{s}{r} \le x < \frac{s+1}{r} \quad (s = 0, 1, \ldots, r-1);$ 

the left-hand end point, but not the right-hand one, is included. The 
sth part contains just the numbers whose decimals begin with s-1, 
and if we remove the (b+1)th part, we reject the numbers whose first 
digit is b. 

We next divide each of the r-1 remaining intervals into r equal parts 
and remove the (b+1)th part of each of them. We have then rejected 
all numbers whose first or second digit is b. Repeating the process 
indefinitely, we reject all numbers in which any digit is b; and S is the 
set which remains. 

† Our explanations here contain the minimum necessary for the understanding of 
§§ 9.11-13 and a few later passages in the book. In particular, we have not given any 
general definition of the measure of a set. There are fuller accounts of all these ideas in 
the standard treatises on analysis. 

In the first stage of the construction we remove one interval of length 
1/r; in the second, r-1 intervals of length $1/r^2$, i.e. of total length 
$(r-1)/r^2$; in the third, $(r-1)^2$ intervals of total length $(r-1)^2/r^3$; and 
so on. What remains after k stages is a set $J_k$ of intervals whose total 
length is 

$1 - \sum_{l=1}^{k} \frac{(r-1)^{l-1}}{r^l},$ 

and this set includes S for every k. Since 

$1 - \sum_{l=1}^{k} \frac{(r-1)^{l-1}}{r^l} \to 1 - \frac{1}{r} \left( 1 - \frac{r-1}{r} \right)^{-1} = 0$ 

when $k \to \infty$, the total length of $J_k$ is small when k is large; and S is 
therefore null. 
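
The total length of the intervals remaining after k stages may be computed exactly, and it is easily seen by induction that it equals the kth power of (r-1)/r. The sketch below, with a function name of our own choosing, confirms this for small cases.

```python
from fractions import Fraction

def remaining_length(r, k):
    """Total length of what is left of (0, 1) after k stages of the construction in scale r."""
    removed = sum(Fraction((r - 1) ** (l - 1), r ** l) for l in range(1, k + 1))
    return 1 - removed

print(remaining_length(10, 1))                                 # 9/10
print(remaining_length(3, 4))                                  # 16/81
print(remaining_length(10, 50) == Fraction(9, 10) ** 50)       # True
```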


THEOREM 145. The set of points whose decimals, in any scale, miss 
any digit is null: almost all decimals contain all possible digits. 


The result may be extended to cover combinations of digits. If the 
sequence 937 never occurs in the ordinary decimal for x, then the digit 
‘937’ never occurs in the decimal in the scale of 1000. Hence 


THEOREM 146. Almost all decimals, in any scale, contain all possible 
sequences of any number of digits. 


Returning to Theorem 145, suppose that r = 3 and b = 1. The set 
S is formed by rejecting the middle third $(\frac{1}{3}, \frac{2}{3})$ of (0, 1), then the middle 
thirds $(\frac{1}{9}, \frac{2}{9})$, $(\frac{7}{9}, \frac{8}{9})$ of $(0, \frac{1}{3})$ and $(\frac{2}{3}, 1)$, and so on. The set which remains 
is null. 

It is immaterial for this conclusion whether we reject or retain the 
end points of rejected intervals, since their aggregate is enumerable and 
therefore null. In fact our definition rejects some, such as $\frac{1}{3} = .1$, and 
includes others, such as $\frac{2}{3} = .2$. 

The set becomes more interesting if we retain all end points. In this 
case (if we wish to preserve the arithmetical definition) we must allow 
ternary decimals ending in $\dot{2}$ (and excluded in our account of decimals 
at the beginning of the chapter). All fractions $p/3^n$ have then two 
representations, such as 

$\frac{1}{3} = .1 = .0\dot{2}$ 

(and it was for this reason that we made the restriction); and an end 
point of a rejected interval has always one without a 1. 

The set S thus defined is called Cantor’s ternary set. 

Suppose that x is any point of (0, 1), except 0 or 1. If x does not 
belong to S, it lies inside a rejected interval, and has neighbourhoods 
free from points of S, so that it does not belong to S'. If x does belong 
to S, then all its neighbourhoods contain other points of S; for 
otherwise there would be one containing x only, and two rejected intervals 
would abut. Hence x belongs to S'. Thus S and S' are identical, and 
S is perfect. 


THEOREM 147. Cantor’s ternary set is a perfect set of measure zero. 
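
For a rational point, membership of Cantor’s ternary set (with end points retained) can be decided in a finite number of steps. The sketch below is ours and rests on a standard observation rather than on anything stated in the text: we reject x if it lies strictly inside a middle third, and otherwise magnify the third containing x to the whole interval and repeat. Since the denominators never increase, the values must eventually repeat.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide whether a rational x belongs to Cantor's ternary set (end points retained)."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if Fraction(1, 3) < x < Fraction(2, 3):
            return False                 # x lies inside a rejected middle third
        # magnify the surviving third that contains x back onto [0, 1]
        x = 3 * x if x <= Fraction(1, 3) else 3 * x - 2
    return True                          # the values repeat without x being rejected

print(in_cantor_set(Fraction(1, 3)))     # True
print(in_cantor_set(Fraction(1, 4)))     # True
print(in_cantor_set(Fraction(1, 2)))     # False
```

The point 1/4 belongs to the set although it is not an end point of any rejected interval; its ternary decimal has the digits 0 and 2 alternately.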


9.12. Normal numbers. The theorems proved in the last section 
express much less than the full truth. Actually it is true, for example, 
not only that almost all decimals contain a 9, but that, in almost all 
decimals, 9 occurs with the proper frequency, that is to say in about 
one-tenth of the possible places. 

Suppose that x is expressed in the scale of r, and that the digit b occurs 
$n_b$ times in the first n places. If 

$\frac{n_b}{n} \to \beta$ 

when $n \to \infty$, then we say that b has frequency $\beta$. It is naturally not 
necessary that such a limit should exist; $n_b/n$ may oscillate, and one 
might expect that usually it would. The theorems which follow prove 
that, contrary to our expectation, there is usually a definite frequency. 
The existence of the limit is in a sense the ordinary event. 

We say that x is simply normal in the scale of r if 

(9.12.1) $\frac{n_b}{n} \to \frac{1}{r}$ 

for each of the r possible values of b. Thus 

$x = .\dot{0}12345678\dot{9}$ 

is simply normal in the scale of 10. The same x may be expressed in the 
scale of $10^{10}$, when its expression is 

$x = .\dot{b},$ 

where b = 123456789. It is plain that in this scale x is not simply 
normal, $10^{10} - 1$ digits being missing. 
This remark leads us to a more exacting definition. We say that x is 
normal in the scale of r if all of the numbers 

$x, rx, r^2 x, \ldots$† 

are simply normal in all of the scales 

$r, r^2, r^3, \ldots.$ 

It follows at once that, when x is expressed in the scale of r, every 
combination bi bu Dy 


of digits occurs with the proper frequency; i.e. that, if n, is the number 
of occurrences of this sequence in the first n digits of x, then 


n 1 
(9.12.2) by 


not 
when n>, 
Our main theorem, which includes and goes beyond those of § 9.11, is 


j THEOREM 148. Almost all numbers are normal in any scale. 


9.13. Proof that almost all numbers are normal. It is sufficient 
to prove that almost all numbers are simply normal in a given scale. 
For suppose that this has been proved, and that S(x,r) is the set of 
numbers x which are not simply normal in the scale of r. Then $S(x, r)$, 
$S(x, r^2)$, $S(x, r^3)$, ... are null, and therefore their sum is null. Hence the 
set $T(x, r)$ of numbers which are not simply normal in all the scales 
$r, r^2, \ldots$ is null. The set $T(rx, r)$ of numbers such that rx is not simply 
normal in all these scales is also null; and so are $T(r^2 x, r)$, $T(r^3 x, r)$, .... 
Hence again the sum of these sets, i.e. the set $U(x, r)$ of numbers which 
are not normal in the scale of r, is null. Finally, the sum of $U(x, 2)$, 
$U(x, 3)$, ... is null; and this proves the theorem. 

We have therefore only to prove that (9.12.1) is true for almost all 
numbers x. We may suppose that n tends to infinity through multiples 
of r, since (9.12.1) is true generally if it is true for n so restricted. 

The number of r-ary decimals of n figures, with just m b's in assigned 
places, is $(r-1)^{n-m}$. Hence the number of such decimals which contain 
just m b's, in one place or another, is‡ 

    $p(n, m) = \dfrac{n!}{m!\,(n-m)!}\,(r-1)^{n-m}$.

† Strictly, the fractional parts of these numbers (since we have been considering 
numbers between 0 and 1). A number greater than 1 is simply normal, or normal, if 
its fractional part is simply normal, or normal. 

‡ p(n, m) is the term in $(r-1)^{n-m}$ in the binomial expansion of 

    $\{1 + (r-1)\}^n$.
We consider any decimal, and the incidence of b's among its first n 
digits, and call 

    $\mu = m - \dfrac{n}{r} = m - n^*$

the n-excess of b (the excess of the actual number of b's over the number 
to be expected). Since n is a multiple of r, $n^*$ and $\mu$ are integers. Also 

(9.13.1)    $-\dfrac{1}{r} \le \dfrac{\mu}{n} \le 1 - \dfrac{1}{r}$.

We have 

(9.13.2)    $\dfrac{p(n, m+1)}{p(n, m)} = \dfrac{n-m}{(r-1)(m+1)} = \dfrac{(r-1)n - r\mu}{(r-1)n + r(r-1)(\mu+1)}$.

Hence 

    $\dfrac{p(n, m+1)}{p(n, m)} > 1 \;\; (\mu \le -1), \qquad \dfrac{p(n, m+1)}{p(n, m)} < 1 \;\; (\mu \ge 0)$;

so that p(n, m) is greatest when 
    $\mu = 0, \quad m = n^*$.
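The counting function p(n, m) and the ratio (9.13.2) are easy to check numerically. The sketch below is only an illustration; the particular values r = 10, n = 50 and the excess 3 are arbitrary choices, not taken from the text.

```python
from fractions import Fraction
from math import comb

def p(n, m, r):
    """Number of r-ary decimals of n figures that contain exactly m b's."""
    return comb(n, m) * (r - 1) ** (n - m)

r, n = 10, 50                      # n is a multiple of r, so n* = n // r
counts = [p(n, m, r) for m in range(n + 1)]
total = sum(counts)                # binomial theorem: {1 + (r-1)}**n
m_star = max(range(n + 1), key=lambda m: counts[m])

# Both sides of (9.13.2) for one sample excess mu.
mu = 3
m = n // r + mu
lhs = Fraction(p(n, m + 1, r), p(n, m, r))
rhs = Fraction((r - 1) * n - r * mu, (r - 1) * n + r * (r - 1) * (mu + 1))
```

Here `total` equals $r^n$, the maximum of p(n, m) falls at m = n/r, and the two sides of (9.13.2) agree exactly.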
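If $\mu \ge 0$, then, by (9.13.2), 

(9.13.3)    $\dfrac{p(n, m+1)}{p(n, m)} = \dfrac{(r-1)n - r\mu}{(r-1)n + r(r-1)(\mu+1)} < 1 - \dfrac{r\mu}{(r-1)n} < \exp\left(-\dfrac{r\mu}{(r-1)n}\right)$.

If $\mu < 0$ and $\nu = |\mu|$, then 

(9.13.4)    $\dfrac{p(n, m-1)}{p(n, m)} = \dfrac{(r-1)m}{n-m+1} = \dfrac{(r-1)n - r(r-1)\nu}{(r-1)n + r(\nu+1)} < 1 - \dfrac{r\nu}{n} < \exp\left(-\dfrac{r\nu}{n}\right) = \exp\left(-\dfrac{r|\mu|}{n}\right)$.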
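We now fix a positive $\delta$, and consider the decimals for which 
(9.13.5)    $|\mu| \ge \delta n$

for a given n. Since n is to be large, we may suppose that $|\mu| \ge 2$. 
If $\mu$ is positive then, by (9.13.3), 

    $\dfrac{p(n, m)}{p(n, m-\mu)} = \dfrac{p(n, m)}{p(n, m-1)} \cdot \dfrac{p(n, m-1)}{p(n, m-2)} \cdots \dfrac{p(n, m-\mu+1)}{p(n, m-\mu)}$

    $< \exp\left[-\dfrac{r}{(r-1)n}\{(\mu-1) + (\mu-2) + \ldots + 1\}\right] = \exp\left\{-\dfrac{r(\mu-1)\mu}{2(r-1)n}\right\} < e^{-K\mu^2/n}$,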
where K is a positive number which depends only on r. Since 

    $p(n, m-\mu) = p(n, n^*) < r^n$,†

it follows that 
(9.13.6)    $p(n, m) < r^n e^{-K\mu^2/n}$.
Similarly it follows from (9.13.4) that (9.13.6) is true also for negative $\mu$. 

Let $S_n(\mu)$ be the set of numbers whose n-excess is $\mu$. There are 
p = p(n, m) numbers $\xi_1, \xi_2, \ldots, \xi_p$ represented by terminating decimals 
of n figures and excess $\mu$, and the numbers of $S_n(\mu)$ are included in the 
intervals 

    $\xi_s, \; \xi_s + r^{-n} \quad (s = 1, 2, \ldots, p)$.

Hence $S_n(\mu)$ is included in a set of intervals whose total length does not 
exceed 
    $r^{-n} p(n, m) < e^{-K\mu^2/n}$.
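And if $T_n(\delta)$ is the set of numbers whose n-excess satisfies (9.13.5), 
then $T_n(\delta)$ can be included in a set of intervals whose length does not 
exceed 

    $\sum_{|\mu| \ge \delta n} e^{-K\mu^2/n} = 2 \sum_{\mu \ge \delta n} e^{-K\mu^2/n} \le 2 \sum_{\mu \ge \delta n} e^{-\frac12 K\mu^2/n}\, e^{-\frac12 K\mu^2/n} \le 2 e^{-\frac12 K\delta^2 n} \sum_{\mu=0}^{\infty} e^{-\frac12 K\mu/n}$

    $= \dfrac{2 e^{-\frac12 K\delta^2 n}}{1 - e^{-\frac12 K/n}} < L n e^{-\frac12 K\delta^2 n}$,

where L, like K, depends only on r. 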
We now fix N (a multiple $N^* r$ of r), and consider the set $U_N(\delta)$ of 
numbers such that (9.13.5) is true for some 
    $n = n^* r \ge N = N^* r$.
Then $U_N(\delta)$ is the sum of the sets 

    $T_N(\delta), \; T_{N+r}(\delta), \; T_{N+2r}(\delta), \; \ldots$,

i.e. the sets $T_n(\delta)$ for which n = kr and $k \ge N^*$. It can therefore be 
included in a set of intervals whose length does not exceed 

    $L \sum_{k \ge N^*} k r\, e^{-\frac12 K\delta^2 k r} = \eta(N^*)$;

and $\eta(N^*) \to 0$ when $n^*$ and $N^*$ tend to infinity. 

If $U(\delta)$ is the set of numbers whose n-excess satisfies (9.13.5) for an 
infinity of n (all multiples of r), then $U(\delta)$ is included in $U_N(\delta)$ for 
every N, and can therefore be included in a set of intervals whose total 
length is as small as we please. That is to say, $U(\delta)$ is null. 

Finally, if x is not simply normal, (9.12.1) is false (even when n is 
restricted to be a multiple of r), and 

    $|\mu| \ge \zeta n$

for some positive $\zeta$ and an infinity of multiples n of r. This $\zeta$ is greater 
than some one of the sequence $\delta, \tfrac12\delta, \tfrac14\delta, \ldots$, and so x belongs to some one 
of the sets 

    $U(\delta), \; U(\tfrac12\delta), \; U(\tfrac14\delta), \; \ldots$,

all of which are null. Hence the set of all such x is null. 

† Indeed $p(n, m) < r^n$ for all m. 

It might be supposed that, since almost all numbers are normal, it 
would be easy to construct examples of normal numbers. There are in 
fact simple constructions; thus the number 


    .123456789101112...,


formed by writing down all the positive integers in order, in decimal 
notation, is normal. But the proof that this is so is more troublesome 
than might be expected. 
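Digit counts for this number can be tabulated directly, although a finite count proves nothing about normality and the convergence of the frequencies is slow. The sketch below is only an illustration; the cut-off n = 488889 is an assumption chosen because it is exactly the number of digits contributed by the integers 1 to 99999.

```python
from collections import Counter
from itertools import count, islice

def champernowne_digits():
    """Yield the decimal digits of .123456789101112... one at a time."""
    for k in count(1):
        yield from str(k)

n = 488_889                       # digits of 1, 2, ..., 99999 written in order
freq = Counter(islice(champernowne_digits(), n))
ratios = {d: freq[d] / n for d in "0123456789"}
```

At this cut-off each non-zero digit has occurred 50000 times and the digit 0 only 38889 times, so the ratios are near, but not yet equal to, one-tenth.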


NOTES ON CHAPTER IX 


§ 9.4. For Theorem 138 see Pólya and Szegő, ii. 160, 383. The result is stated 
without proof in W. H. and G. C. Young’s The theory of sets of points, 3. 

§ 9.5. See Dickson, History, i, ch. xii. The test for 7, 11, and 13 is not mentioned 
explicitly. It is explained by Grunert, Archiv der Math. und Phys. 42 (1864), 
478-82. Grunert gives slightly earlier references to Brilka and V. A. Lebesgue. 

§§ 9.7-8. See Ahrens, ch. iii. 

There is an interesting logical point involved in the definition of a 'losing' 
position in Nim. We define a losing position as one which is not a winning position, 
i.e. as a position such that P cannot force a win by leaving it to Q. It follows 
from our analysis of the game that a losing position in this sense is also a losing 
position in the sense that Q can force a win if P leaves such a position to Q. This 
is a case of a general theorem (due to Zermelo and von Neumann) true of any 
game in which there are only two possible results and only a finite choice of 
‘moves’ at any stage. See D. König, Acta Univ. Hungaricae (Szeged), 3 (1927), 
121-30. 

§ 9.10. Our 'limit point' is the 'limiting point' of Hobson's Theory of functions 
of a real variable or the 'Häufungspunkt' of Hausdorff's Mengenlehre. 

§§ 9.12-13. Niven and Zuckerman (Pacific Journal of Math. 1 (1951), 103-9) and 
Cassels (ibid. 2 (1952), 555-7) give proofs that, if (9.12.2) holds for every sequence 
of digits, then x is normal. This is the converse of our statement that (9.12.2) 
follows from the definition; the proof of this converse is not trivial. 

For the substance of these sections see Borel, Leçons sur la théorie des fonctions 
(2nd ed., 1914), 182-216. Theorem 148 has been developed in various ways since 
it was originally proved by Borel in 1909. Full references will be found in 
Koksma, 116-18. 

Champernowne (Journal London Math. Soc. 8 (1933), 254-60) proved that 
.123... is normal. Copeland and Erdős (Bulletin Amer. Math. Soc. 52 (1946), 
857-60) proved that, if $a_1, a_2, \ldots$ is any increasing sequence of integers such that 
$a_n < n^{1+\epsilon}$ for every $\epsilon > 0$ and $n > n_0(\epsilon)$, then the decimal 

    $.a_1 a_2 a_3 \ldots$

(formed by writing out the digits of the $a_n$ in any scale in order) is normal in 
that scale. 


X 
CONTINUED FRACTIONS 
10.1. Finite continued fractions. We shall describe the function 

(10.1.1)    $a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots + \cfrac{1}{a_N}}}}$

of the N+1 variables 
    $a_0, a_1, \ldots, a_n, \ldots, a_N$,
as a finite continued fraction, or, when there is no risk of ambiguity, 
simply as a continued fraction. Continued fractions are important in 
many branches of mathematics, and particularly in the theory of ap- 
proximation to real numbers by rationals. There are more general types 
of continued fractions in which the ‘numerators’ are not all 1’s, but we 
shall not require them here. 
The formula (10.1.1) is cumbrous, and we shall usually write the 
continued fraction in one of the two forms 

    $a_0 + \dfrac{1}{a_1 +}\,\dfrac{1}{a_2 +} \cdots \dfrac{1}{a_N}$

or 
    $[a_0, a_1, a_2, \ldots, a_N]$.

We call $a_0, a_1, \ldots, a_N$ the partial quotients, or simply the quotients, of the 
continued fraction. 
We find by calculation that† 

    $[a_0] = \dfrac{a_0}{1}, \quad [a_0, a_1] = \dfrac{a_1 a_0 + 1}{a_1}, \quad [a_0, a_1, a_2] = \dfrac{a_2 a_1 a_0 + a_2 + a_0}{a_2 a_1 + 1}$;

and it is plain that 

(10.1.2)    $[a_0, a_1] = a_0 + \dfrac{1}{a_1}$,

(10.1.3)    $[a_0, a_1, \ldots, a_{n-1}, a_n] = \left[a_0, a_1, \ldots, a_{n-2}, a_{n-1} + \dfrac{1}{a_n}\right]$,

(10.1.4)    $[a_0, a_1, \ldots, a_n] = a_0 + \dfrac{1}{[a_1, a_2, \ldots, a_n]} = [a_0, [a_1, a_2, \ldots, a_n]]$,

for $1 \le n \le N$. We could define our continued fraction by (10.1.2) 
and either (10.1.3) or (10.1.4). More generally 

(10.1.5)    $[a_0, a_1, \ldots, a_n] = [a_0, a_1, \ldots, a_{m-1}, [a_m, a_{m+1}, \ldots, a_n]]$

for $1 \le m < n \le N$. 

† There is a clash between our notation here and that of § 6.11, which we shall use 
again later in the chapter (for example in § 10.5). In § 6.11, [x] was defined as the integral 
part of x; while here $[a_0]$ means simply $a_0$. The ambiguity should not confuse the 
reader, since we use $[a_0]$ here merely as a special case of $[a_0, a_1, \ldots, a_n]$. The square bracket 
in this sense will seldom occur with a single letter inside it, and will not then be 
important. 

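10.2. Convergents to a continued fraction. We call 
    $[a_0, a_1, \ldots, a_n] \quad (0 \le n \le N)$
the nth convergent to $[a_0, a_1, \ldots, a_N]$. It is easy to calculate the 
convergents by means of the following theorem. 


THEOREM 149. If $p_n$ and $q_n$ are defined by 
(10.2.1)    $p_0 = a_0, \quad p_1 = a_1 a_0 + 1, \quad p_n = a_n p_{n-1} + p_{n-2} \quad (2 \le n \le N)$,

(10.2.2)    $q_0 = 1, \quad q_1 = a_1, \quad q_n = a_n q_{n-1} + q_{n-2} \quad (2 \le n \le N)$,
then 
(10.2.3)    $[a_0, a_1, \ldots, a_n] = \dfrac{p_n}{q_n}$.


We have already verified the theorem for n = 0 and n = 1. Let us 
suppose it to be true for $n \le m$, where m < N. Then 

    $[a_0, a_1, \ldots, a_{m-1}, a_m] = \dfrac{p_m}{q_m} = \dfrac{a_m p_{m-1} + p_{m-2}}{a_m q_{m-1} + q_{m-2}}$,

and $p_{m-1}, p_{m-2}, q_{m-1}, q_{m-2}$ depend only on 
    $a_0, a_1, \ldots, a_{m-1}$.
Hence, using (10.1.3), we obtain 

    $[a_0, a_1, \ldots, a_{m-1}, a_m, a_{m+1}] = \left[a_0, a_1, \ldots, a_{m-1}, a_m + \dfrac{1}{a_{m+1}}\right]$

    $= \dfrac{\left(a_m + \dfrac{1}{a_{m+1}}\right) p_{m-1} + p_{m-2}}{\left(a_m + \dfrac{1}{a_{m+1}}\right) q_{m-1} + q_{m-2}}$

    $= \dfrac{a_{m+1}(a_m p_{m-1} + p_{m-2}) + p_{m-1}}{a_{m+1}(a_m q_{m-1} + q_{m-2}) + q_{m-1}}$

    $= \dfrac{a_{m+1} p_m + p_{m-1}}{a_{m+1} q_m + q_{m-1}} = \dfrac{p_{m+1}}{q_{m+1}}$,

and the theorem is proved by induction. 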
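The recurrences (10.2.1) and (10.2.2) translate directly into a short computation. The sketch below is an illustration only; the function names are hypothetical, and the starting values $p_{-1} = 1$, $q_{-1} = 0$ are a common convention assumed here (they reproduce $p_1 = a_1 a_0 + 1$ and $q_1 = a_1$) rather than something stated in the text.

```python
from fractions import Fraction

def convergents(a):
    """Return [(p_0, q_0), ..., (p_N, q_N)] for the quotients a = [a_0, ..., a_N]."""
    p_prev, q_prev = 1, 0              # assumed convention: p_{-1} = 1, q_{-1} = 0
    p, q = a[0], 1                     # p_0 = a_0, q_0 = 1
    out = [(p, q)]
    for an in a[1:]:
        p, p_prev = an * p + p_prev, p     # p_n = a_n p_{n-1} + p_{n-2}
        q, q_prev = an * q + q_prev, q     # q_n = a_n q_{n-1} + q_{n-2}
        out.append((p, q))
    return out

def value(a):
    """Evaluate [a_0, ..., a_N] from the innermost quotient outwards, as in (10.1.4)."""
    x = Fraction(a[-1])
    for an in reversed(a[:-1]):
        x = an + 1 / x
    return x

pq = convergents([2, 2, 3])            # convergents 2/1, 5/2, 17/7
```

Comparing `convergents` with `value` on every initial segment of the quotients is a direct check of (10.2.3).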


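It follows from (10.2.1) and (10.2.2) that 

(10.2.4)    $\dfrac{p_n}{q_n} = \dfrac{a_n p_{n-1} + p_{n-2}}{a_n q_{n-1} + q_{n-2}}$.

Also 

    $p_n q_{n-1} - p_{n-1} q_n = (a_n p_{n-1} + p_{n-2}) q_{n-1} - p_{n-1}(a_n q_{n-1} + q_{n-2})$
    $= -(p_{n-1} q_{n-2} - p_{n-2} q_{n-1})$.

Repeating the argument with n-1, n-2, ..., 2 in place of n, we obtain 

    $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}(p_1 q_0 - p_0 q_1) = (-1)^{n-1}$.

Also 
    $p_n q_{n-2} - p_{n-2} q_n = (a_n p_{n-1} + p_{n-2}) q_{n-2} - p_{n-2}(a_n q_{n-1} + q_{n-2})$
    $= a_n (p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) = (-1)^n a_n$.

THEOREM 150. The functions $p_n$ and $q_n$ satisfy 

(10.2.5)    $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}$
or 
(10.2.6)    $\dfrac{p_n}{q_n} - \dfrac{p_{n-1}}{q_{n-1}} = \dfrac{(-1)^{n-1}}{q_{n-1} q_n}$.

THEOREM 151. They also satisfy 

(10.2.7)    $p_n q_{n-2} - p_{n-2} q_n = (-1)^n a_n$
or 
(10.2.8)    $\dfrac{p_n}{q_n} - \dfrac{p_{n-2}}{q_{n-2}} = \dfrac{(-1)^n a_n}{q_{n-2} q_n}$.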
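The identities (10.2.5) and (10.2.7) can be confirmed on any particular list of quotients. The following sketch is only an illustration; the quotients 3, 7, 15, 1, 292 are an arbitrary example, and the helper `convergents` (with the assumed convention $p_{-1} = 1$, $q_{-1} = 0$) is hypothetical.

```python
def convergents(a):
    """Return [(p_0, q_0), ..., (p_N, q_N)] using (10.2.1) and (10.2.2)."""
    p_prev, q_prev = 1, 0              # assumed convention: p_{-1} = 1, q_{-1} = 0
    p, q = a[0], 1
    out = [(p, q)]
    for an in a[1:]:
        p, p_prev = an * p + p_prev, p
        q, q_prev = an * q + q_prev, q
        out.append((p, q))
    return out

a = [3, 7, 15, 1, 292]
pq = convergents(a)

# p_n q_{n-1} - p_{n-1} q_n for n = 1, ..., N
adjacent = [pq[n][0] * pq[n - 1][1] - pq[n - 1][0] * pq[n][1]
            for n in range(1, len(a))]
# p_n q_{n-2} - p_{n-2} q_n for n = 2, ..., N
alternate = [pq[n][0] * pq[n - 2][1] - pq[n - 2][0] * pq[n][1]
             for n in range(2, len(a))]
```

The first list alternates between 1 and -1, and the second reproduces the quotients $a_n$ with alternating sign.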


10.3. Continued fractions with positive quotients. We now 
assign numerical values to the quotients a,, and so to the fraction 
(10.1.1) and to its convergents. We shall always suppose that 
(10.3.1)    $a_1 > 0, \; \ldots, \; a_N > 0$,†
and usually also that $a_n$ is integral, in which case the continued fraction 
is said to be simple. But it is convenient first to prove three theorems 
(Theorems 152-4 below) which hold for all continued fractions in which 
the quotients satisfy (10.3.1). We write 

    $x_n = \dfrac{p_n}{q_n}, \quad x = x_N$,

so that the value of the continued fraction is $x_N$ or x. 
It follows from (10.1.5) that 

(10.3.2)    $x = [a_0, a_1, \ldots, a_N] = [a_0, a_1, \ldots, a_{n-1}, [a_n, a_{n+1}, \ldots, a_N]]$

    $= \dfrac{[a_n, a_{n+1}, \ldots, a_N]\, p_{n-1} + p_{n-2}}{[a_n, a_{n+1}, \ldots, a_N]\, q_{n-1} + q_{n-2}}$

for $2 \le n \le N$. 

† $a_0$ may be negative. 
THEOREM 152. The even convergents $x_{2n}$ increase strictly with n, while 
the odd convergents $x_{2n+1}$ decrease strictly. 


THEOREM 153. Every odd convergent is greater than any even convergent. 


THEOREM 154. The value of the continued fraction is greater than that 
of any of its even convergents and less than that of any of its odd convergents 
(except that it is equal to the last convergent, whether this be even or odd). 


In the first place every $q_n$ is positive, so that, after (10.2.8) and 
(10.3.1), $x_n - x_{n-2}$ has the sign of $(-1)^n$. This proves Theorem 152. 
Next, after (10.2.6), $x_n - x_{n-1}$ has the sign of $(-1)^{n-1}$, so that 

(10.3.3)    $x_{2m+1} > x_{2m}$.

If Theorem 153 were false, we should have $x_{2m+1} \le x_{2\mu}$ for some pair 
m, $\mu$. If $\mu < m$, then, after Theorem 152, $x_{2m+1} < x_{2m}$, and if $\mu > m$, 
then $x_{2\mu+1} < x_{2\mu}$; and either inequality contradicts (10.3.3). 

Finally, $x = x_N$ is the greatest of the even, or the least of the odd 
convergents, and Theorem 154 is true in either case. 
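Theorems 152-4 describe an interlacing pattern that is easy to observe numerically. The sketch below is an illustration under stated assumptions: the quotients 1, 2, 3, 4, 5, 6 are an arbitrary example with positive $a_1, \ldots, a_N$, and `convergents` is a hypothetical helper implementing (10.2.1) and (10.2.2).

```python
from fractions import Fraction

def convergents(a):
    """Return [(p_0, q_0), ..., (p_N, q_N)] using (10.2.1) and (10.2.2)."""
    p_prev, q_prev = 1, 0              # assumed convention: p_{-1} = 1, q_{-1} = 0
    p, q = a[0], 1
    out = [(p, q)]
    for an in a[1:]:
        p, p_prev = an * p + p_prev, p
        q, q_prev = an * q + q_prev, q
        out.append((p, q))
    return out

a = [1, 2, 3, 4, 5, 6]
xs = [Fraction(p, q) for p, q in convergents(a)]
even = xs[0::2]                        # x_0, x_2, x_4
odd = xs[1::2]                         # x_1, x_3, x_5
x = xs[-1]                             # value of the whole fraction, x = x_N
```

The even convergents 1, 10/7, 225/157 increase, the odd convergents 3/2, 43/30, 1393/972 decrease, and every even convergent lies below every odd one.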


10.4. Simple continued fractions. We now suppose that the a, 
are integral and the fraction simple. The rest of the chapter will be con- 
cerned with the special properties of simple continued fractions, and 
other fractions will occur only incidentally. It is plain that p, and q, 
are integers, and q, positive. If 


    $[a_0, a_1, \ldots, a_N] = \dfrac{p_N}{q_N} = x$,


we say that the number x (which is necessarily rational) is represented 
by the continued fraction. We shall see in a moment that, with one 
reservation, the representation is unique. 

THEOREM 155. $q_n \ge q_{n-1}$ for $n \ge 1$, with inequality when n > 1. 

THEOREM 156. $q_n \ge n$, with inequality when n > 3. 

In the first place, $q_0 = 1$, $q_1 = a_1 \ge 1$. If $n \ge 2$, then 

    $q_n = a_n q_{n-1} + q_{n-2} \ge q_{n-1} + 1$,

so that $q_n > q_{n-1}$ and $q_n \ge n$. If n > 3, then 

    $q_n \ge q_{n-1} + q_{n-2} > q_{n-1} + 1 \ge n$,
and so $q_n > n$. 
A more important property of the convergents is 
THEOREM 157. The convergents to a simple continued fraction are in 
their lowest terms. 
For, by Theorem 150, 


    $d \mid p_n \;.\; d \mid q_n \;\to\; d \mid (-1)^{n-1} \;\to\; d \mid 1$.
10.5. The representation of an irreducible rational fraction 
by a simple continued fraction. Any simple continued fraction 
$[a_0, a_1, \ldots, a_N]$ represents a rational number 

    $x = x_N$.
In this and the next section we prove that, conversely, every positive 
rational x is representable by a simple continued fraction, and that, 
apart from one ambiguity, the representation is unique. 

THEOREM 158. If x is representable by a simple continued fraction with 
an odd (even) number of convergents, it is also representable by one with 
an even (odd) number. 

For, if $a_n \ge 2$, 

    $[a_0, a_1, \ldots, a_n] = [a_0, a_1, \ldots, a_n - 1, 1]$,
while, if $a_n = 1$, 

    $[a_0, a_1, \ldots, a_{n-1}, 1] = [a_0, a_1, \ldots, a_{n-2}, a_{n-1} + 1]$.

For example [2, 2, 3] = [2, 2, 2, 1]. 
This choice of alternative representations is often useful. 
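The two rewritings used in the proof of Theorem 158 can be expressed as a small function. This is a sketch only; the names `alternative` and `value` are hypothetical, and the treatment of a single quotient $[a_0]$ as $[a_0 - 1, 1]$ is an added assumption covering the case n = 0.

```python
from fractions import Fraction

def value(a):
    """Evaluate the finite continued fraction [a_0, ..., a_n] exactly."""
    x = Fraction(a[-1])
    for an in reversed(a[:-1]):
        x = an + 1 / x
    return x

def alternative(a):
    """Return the other representation of [a_0, ..., a_n] given by Theorem 158."""
    if len(a) == 1 or a[-1] >= 2:
        return a[:-1] + [a[-1] - 1, 1]     # split the last quotient: a_n = (a_n - 1) + 1/1
    return a[:-2] + [a[-2] + 1]            # last quotient is 1: absorb it into a_{n-1}

other = alternative([2, 2, 3])
```

Applying `alternative` twice returns the original list of quotients, and the value is unchanged at each step.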
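We call 
    $a'_n = [a_n, a_{n+1}, \ldots, a_N] \quad (0 \le n \le N)$
the n-th complete quotient of the continued fraction 
    $[a_0, a_1, \ldots, a_n, \ldots, a_N]$.
Thus 
    $x = a'_0, \quad x = \dfrac{a'_1 a_0 + 1}{a'_1}$
and 

(10.5.1)    $x = \dfrac{a'_n p_{n-1} + p_{n-2}}{a'_n q_{n-1} + q_{n-2}} \quad (2 \le n \le N)$.

THEOREM 159. $a_n = [a'_n]$, the integral part of $a'_n$,† except that 
    $a_{N-1} = [a'_{N-1}] - 1$
when $a_N = 1$. 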
If N = 0, then $a_0 = a'_0 = [a'_0]$. If N > 0, then 

    $a'_n = a_n + \dfrac{1}{a'_{n+1}} \quad (0 \le n \le N-1)$.

Now 
    $a'_{n+1} > 1 \quad (0 \le n \le N-1)$
except that $a'_{n+1} = 1$ when n = N-1 and $a_N = 1$. Hence 
(10.5.2)    $a_n < a'_n < a_n + 1 \quad (0 \le n \le N-1)$
and 
    $a_n = [a'_n] \quad (0 \le n \le N-1)$
except in the case specified. And in any case 
    $a_N = a'_N = [a'_N]$.

† We revert here to our habitual use of the square bracket in accordance with the 
definition of § 6.11. 
THEOREM 160. If two simple continued fractions 
    $[a_0, a_1, \ldots, a_N], \quad [b_0, b_1, \ldots, b_M]$
have the same value x, and $a_N > 1$, $b_M > 1$, then M = N and the fractions 
are identical. 

When we say that two continued fractions are identical we mean that 
they are formed by the same sequence of partial quotients. 

By Theorem 159, $a_0 = [x] = b_0$. Let us suppose that the first n 
partial quotients in the continued fractions are identical, and that 
$a'_n$, $b'_n$ are the nth complete quotients. Then 

    $x = [a_0, a_1, \ldots, a_{n-1}, a'_n] = [a_0, a_1, \ldots, a_{n-1}, b'_n]$.

If n = 1, then 
    $a_0 + \dfrac{1}{a'_1} = a_0 + \dfrac{1}{b'_1}$,
$a'_1 = b'_1$, and therefore, by Theorem 159, $a_1 = b_1$. If n > 1, then, by 
(10.5.1), 

    $\dfrac{a'_n p_{n-1} + p_{n-2}}{a'_n q_{n-1} + q_{n-2}} = \dfrac{b'_n p_{n-1} + p_{n-2}}{b'_n q_{n-1} + q_{n-2}}$,

    $(a'_n - b'_n)(p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) = 0$.

But $p_{n-1} q_{n-2} - p_{n-2} q_{n-1} = (-1)^n$, by Theorem 150, and so $a'_n = b'_n$. It 
follows from Theorem 159 that $a_n = b_n$. 

Suppose now, for example, that $N \le M$. Then our argument shows 
that 
    $a_n = b_n$
for $n \le N$. If M > N, then 

    $\dfrac{p_N}{q_N} = [a_0, a_1, \ldots, a_N] = [a_0, a_1, \ldots, a_N, b_{N+1}, \ldots, b_M] = \dfrac{b'_{N+1} p_N + p_{N-1}}{b'_{N+1} q_N + q_{N-1}}$,

by (10.5.1); or 
    $p_N q_{N-1} - p_{N-1} q_N = 0$,
which is false. Hence M = N and the fractions are identical. 
10.6. The continued fraction algorithm and Euclid's algorithm. 
Let x be any real number, and let $a_0 = [x]$. Then 
    $x = a_0 + \xi_0, \quad 0 \le \xi_0 < 1$.
If $\xi_0 \ne 0$, we can write 

    $\dfrac{1}{\xi_0} = a'_1, \quad [a'_1] = a_1, \quad a'_1 = a_1 + \xi_1, \quad 0 \le \xi_1 < 1$.

If $\xi_1 \ne 0$, we can write 

    $\dfrac{1}{\xi_1} = a'_2 = a_2 + \xi_2, \quad 0 \le \xi_2 < 1$,

and so on. Also $a'_n = 1/\xi_{n-1} > 1$, and so $a_n \ge 1$, for $n \ge 1$. Thus 

    $x = [a_0, a'_1] = \left[a_0, a_1 + \dfrac{1}{a'_2}\right] = [a_0, a_1, a'_2] = [a_0, a_1, a_2, a'_3] = \ldots$,

where $a_0, a_1, \ldots$ are integers and 
    $a_1 > 0, \; a_2 > 0, \; \ldots$.

The system of equations 

    $x = a_0 + \xi_0 \quad (0 \le \xi_0 < 1)$,

    $\dfrac{1}{\xi_0} = a'_1 = a_1 + \xi_1 \quad (0 \le \xi_1 < 1)$,

    $\dfrac{1}{\xi_1} = a'_2 = a_2 + \xi_2 \quad (0 \le \xi_2 < 1)$,

    ..........

is known as the continued fraction algorithm. The algorithm continues 
so long as $\xi_n \ne 0$. If we eventually reach a value of n, say N, for which 
$\xi_N = 0$, the algorithm terminates and 
    $x = [a_0, a_1, \ldots, a_N]$.

In this case x is represented by a simple continued fraction, and is 
rational. The numbers $a'_n$ are the complete quotients of the continued 
fraction. 
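The system of equations above can be followed step by step in exact arithmetic. The sketch below is an illustration, assuming a rational input so that the process terminates; the function name `cf_algorithm` and the safety cap `max_terms` are hypothetical.

```python
from fractions import Fraction
from math import floor

def cf_algorithm(x, max_terms=50):
    """Partial quotients a_0, a_1, ... produced by the continued fraction algorithm."""
    x = Fraction(x)
    quotients = []
    for _ in range(max_terms):
        a = floor(x)               # a_n = [a'_n], the integral part
        quotients.append(a)
        xi = x - a                 # xi_n, with 0 <= xi_n < 1
        if xi == 0:                # the algorithm terminates when xi_N = 0
            break
        x = 1 / xi                 # next complete quotient a'_{n+1} = 1/xi_n
    return quotients

positive = cf_algorithm(Fraction(17, 7))
negative = cf_algorithm(Fraction(-7, 5))
```

For 17/7 the quotients are 2, 2, 3; for -7/5 the first quotient is -2 and the remaining ones are positive, in agreement with (10.3.1).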

THEOREM 161. Any rational number can be represented by a finite 
simple continued fraction. 

If x is an integer, then $\xi_0 = 0$ and $x = a_0$. If x is not integral, then 

    $x = \dfrac{h}{k}$,

where h and k are integers and k > 1. Since 

    $\dfrac{h}{k} = a_0 + \xi_0, \quad h = a_0 k + \xi_0 k$,

$a_0$ is the quotient, and $k_1 = \xi_0 k$ the remainder, when h is divided by k.† 
If $\xi_0 \ne 0$, then 
    $a'_1 = \dfrac{1}{\xi_0} = \dfrac{k}{k_1}$
and 
    $\dfrac{k}{k_1} = a_1 + \xi_1, \quad k = a_1 k_1 + \xi_1 k_1$;
thus $a_1$ is the quotient, and $k_2 = \xi_1 k_1$ the remainder, when k is divided 
by $k_1$. We thus obtain a series of equations 
    $h = a_0 k + k_1, \quad k = a_1 k_1 + k_2, \quad k_1 = a_2 k_2 + k_3, \quad \ldots$
continuing so long as $\xi_n \ne 0$, or, what is the same thing, so long as 
$k_{n+1} \ne 0$. 

The non-negative integers $k, k_1, k_2, \ldots$ form a strictly decreasing sequence, 
and so $k_{N+1} = 0$ for some N. It follows that $\xi_N = 0$ for some N, and that 
the continued fraction algorithm terminates. This proves Theorem 161. 

† The 'remainder', here and in what follows, is to be non-negative (here positive). 
If $a_0 \ge 0$, then x and h are positive and $k_1$ is the remainder in the ordinary sense of 
arithmetic. If $a_0 < 0$, then x and h are negative and the 'remainder' is 

    $(x - [x])k$.

Thus if h = -7, k = 5, the 'remainder' is 
    $(-\tfrac{7}{5} - [-\tfrac{7}{5}])\,5 = (-\tfrac{7}{5} + 2)\,5 = 3$.

The system of equations 

    $h = a_0 k + k_1 \quad (0 < k_1 < k)$,
    $k = a_1 k_1 + k_2 \quad (0 < k_2 < k_1)$,
    ..........
    $k_{N-2} = a_{N-1} k_{N-1} + k_N \quad (0 < k_N < k_{N-1})$,
    $k_{N-1} = a_N k_N$
is known as Euclid's algorithm. The reader will recognize the process 
as that adopted in elementary arithmetic to determine the greatest 
common divisor $k_N$ of h and k. 

Since $\xi_N = 0$, $a'_N = a_N$; also 

    $0 < \dfrac{1}{a_N} = \dfrac{1}{a'_N} = \xi_{N-1} < 1$,

and so $a_N \ge 2$. Hence the algorithm determines a representation of 
the type which was shown to be unique in Theorem 160. We may always 
make the variation of Theorem 158. 
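Euclid's algorithm yields the partial quotients and the greatest common divisor in a single pass. The following sketch is illustrative; the function name is hypothetical, and it relies on the fact that Python's `divmod` returns a non-negative remainder for a positive divisor, which matches the convention for the 'remainder' adopted above.

```python
def euclid_quotients(h, k):
    """Return (quotients, g) for h/k with k > 0: the a_n of Euclid's algorithm and g = gcd(h, k)."""
    quotients = []
    while k != 0:
        a, remainder = divmod(h, k)    # h = a*k + remainder, 0 <= remainder < k
        quotients.append(a)
        h, k = k, remainder
    return quotients, h                # the last non-zero remainder k_N is the g.c.d.

q1, g1 = euclid_quotients(17, 7)
q2, g2 = euclid_quotients(-7, 5)
q3, g3 = euclid_quotients(34, 14)
```

For a non-integral h/k the last quotient returned is at least 2, which is the representation singled out in Theorem 160; 34/14 gives the same quotients as 17/7, with g.c.d. 2.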

Summing up our results we obtain 

THEOREM 162. A rational number can be expressed as a finite simple 
continued fraction in just two ways, one with an even and the other with 
an odd number of convergents. In one form the last partial quotient is 1, 
in the other it is greater than 1. 


10.7. The difference between the fraction and its convergents. 
Throughout this section we suppose that N > 1 and n > 0. By (10.5.1) 

    $x = \dfrac{a'_{n+1} p_n + p_{n-1}}{a'_{n+1} q_n + q_{n-1}}$

for $1 \le n \le N-1$, and so 

    $x - \dfrac{p_n}{q_n} = -\dfrac{p_n q_{n-1} - p_{n-1} q_n}{q_n(a'_{n+1} q_n + q_{n-1})} = \dfrac{(-1)^n}{q_n(a'_{n+1} q_n + q_{n-1})}$.

Also 
    $x - \dfrac{p_0}{q_0} = x - a_0 = \dfrac{1}{a'_1}$.


If we write 
(10.7.1)    $q'_1 = a'_1, \quad q'_n = a'_n q_{n-1} + q_{n-2} \quad (1 < n \le N)$
(so that, in particular, $q'_N = q_N$), we obtain 

THEOREM 163. If $1 \le n \le N-1$, then 

    $x - \dfrac{p_n}{q_n} = \dfrac{(-1)^n}{q_n q'_{n+1}}$.

This formula gives another proof of Theorem 154. 
Next, 
    $a_{n+1} < a'_{n+1} < a_{n+1} + 1$
for $n \le N-2$, by (10.5.2), except that 
    $a'_{N-1} = a_{N-1} + 1$
when $a_N = 1$. Hence, if we ignore this exceptional case for the moment, 
we have 

(10.7.2)    $q_1 = a_1 < a'_1 < a_1 + 1 \le q_2$

and 

(10.7.3)    $q'_{n+1} = a'_{n+1} q_n + q_{n-1} > a_{n+1} q_n + q_{n-1} = q_{n+1}$,

(10.7.4)    $q'_{n+1} < a_{n+1} q_n + q_{n-1} + q_n = q_{n+1} + q_n \le a_{n+2} q_{n+1} + q_n = q_{n+2}$,

for $1 \le n \le N-2$. It follows that 

(10.7.5)    $\dfrac{1}{q_{n+2}} < |p_n - q_n x| < \dfrac{1}{q_{n+1}} \quad (n \le N-2)$,

while 

(10.7.6)    $|p_{N-1} - q_{N-1} x| = \dfrac{1}{q_N}, \quad p_N - q_N x = 0$.

In the exceptional case, (10.7.4) must be replaced by 
    $q'_{N-1} = (a_{N-1} + 1) q_{N-2} + q_{N-3} = q_{N-1} + q_{N-2} = q_N$
and the first inequality in (10.7.5) by an equality. In any case (10.7.5) 
shows that $|p_n - q_n x|$ decreases steadily as n increases; a fortiori, since 
$q_n$ increases steadily, 

    $\left| x - \dfrac{p_n}{q_n} \right|$

decreases steadily. 
We may sum up the most important of our conclusions in 


THEOREM 164. If N > 1, n > 0, then the differences 

    $x - \dfrac{p_n}{q_n}, \quad q_n x - p_n$

decrease steadily in absolute value as n increases. Also 

    $q_n x - p_n = \dfrac{(-1)^n \delta_n}{q_{n+1}}$,

where 
    $0 < \delta_n < 1 \quad (1 \le n \le N-2), \quad \delta_{N-1} = 1$,
and 

(10.7.7)    $\left| x - \dfrac{p_n}{q_n} \right| \le \dfrac{1}{q_n q_{n+1}} < \dfrac{1}{q_n^2}$

for $n \le N-1$, with inequality in both places except when n = N-1. 
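The bounds (10.7.5) to (10.7.7) can be examined on a concrete fraction. The sketch below is an illustration only; the example x = [3, 7, 15, 1, 292] = 103993/33102 is an arbitrary choice (with last quotient greater than 1, so the exceptional case does not arise), and `convergents` is a hypothetical helper.

```python
from fractions import Fraction

def convergents(a):
    """Return [(p_0, q_0), ..., (p_N, q_N)] using (10.2.1) and (10.2.2)."""
    p_prev, q_prev = 1, 0              # assumed convention: p_{-1} = 1, q_{-1} = 0
    p, q = a[0], 1
    out = [(p, q)]
    for an in a[1:]:
        p, p_prev = an * p + p_prev, p
        q, q_prev = an * q + q_prev, q
        out.append((p, q))
    return out

a = [3, 7, 15, 1, 292]
pq = convergents(a)
N = len(a) - 1
x = Fraction(*pq[N])                   # x = p_N / q_N

# |p_n - q_n x| for n = 0, ..., N
errors = [abs(p - q * x) for p, q in pq]
```

Here the errors for n = 1, 2, 3 are 293/33102, 292/33102 and 1/33102, each lying between $1/q_{n+2}$ and $1/q_{n+1}$ as (10.7.5) and (10.7.6) require.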


10.8. Infinite simple continued fractions. We have considered 
so far only finite continued fractions; and these, when they are simple, 
represent rational numbers. The chief interest of continued fractions, 
however, lies in their application to the representation of irrationals, 
and for this infinite continued fractions are needed. 

Suppose that $a_0, a_1, a_2, \ldots$ is a sequence of integers satisfying (10.3.1), 
so that 
    $x_n = [a_0, a_1, \ldots, a_n]$
is, for every n, a simple continued fraction representing a rational 
number $x_n$. If, as we shall prove in a moment, $x_n$ tends to a limit x 
when $n \to \infty$, then it is natural to say that the simple continued fraction 

(10.8.1)    $[a_0, a_1, a_2, \ldots]$
converges to the value x, and to write 
(10.8.2)    $x = [a_0, a_1, a_2, \ldots]$.

THEOREM 165. If $a_0, a_1, a_2, \ldots$ is a sequence of integers satisfying 
(10.3.1), then $x_n = [a_0, a_1, \ldots, a_n]$ tends to a limit x when $n \to \infty$. 

We may express this more shortly as 

THEOREM 166. All infinite simple continued fractions are convergent. 

We write 
    $x_n = \dfrac{p_n}{q_n} = [a_0, a_1, \ldots, a_n]$,
as in § 10.3, and call these fractions the convergents to (10.8.1). We 
have to show that the convergents tend to a limit. 

If $N \ge n$, the convergent $x_n$ is also a convergent to $[a_0, a_1, \ldots, a_N]$. 
Hence, by Theorem 152, the even convergents form an increasing and 
the odd convergents a decreasing sequence. 

Every even convergent is less than $x_1$, by Theorem 153, so that the 
increasing sequence of even convergents is bounded above; and every 
odd convergent is greater than $x_0$, so that the decreasing sequence of 
odd convergents is bounded below. Hence the even convergents tend 
to a limit $\xi_1$, and the odd convergents to a limit $\xi_2$, and $\xi_1 \le \xi_2$. 

Finally, by Theorems 150 and 156, 

    $\left| \dfrac{p_{2n}}{q_{2n}} - \dfrac{p_{2n-1}}{q_{2n-1}} \right| = \dfrac{1}{q_{2n} q_{2n-1}} \le \dfrac{1}{2n(2n-1)} \to 0$,

so that $\xi_1 = \xi_2 = x$, say, and the fraction (10.8.1) converges to x. 
Incidentally we see that 

THEOREM 167. An infinite simple continued fraction is less than any 
of its odd convergents and greater than any of its even convergents. 


Here, and often in what follows, we use 'the continued fraction' as 
an abbreviation for ‘the value of the continued fraction’. 
10.9. The representation of an irrational number by an infinite 
continued fraction. We call 
    $a'_n = [a_n, a_{n+1}, \ldots]$
the n-th complete quotient of the continued fraction 

    $x = [a_0, a_1, \ldots]$.

Clearly 
    $a'_n = \lim_{N \to \infty} [a_n, a_{n+1}, \ldots, a_N] = a_n + \lim_{N \to \infty} \dfrac{1}{[a_{n+1}, \ldots, a_N]} = a_n + \dfrac{1}{a'_{n+1}}$,

and in particular 
    $x = a'_0 = a_0 + \dfrac{1}{a'_1}$.

Also 
    $a'_n > a_n, \quad a'_{n+1} > a_{n+1} > 0, \quad 0 < \dfrac{1}{a'_{n+1}} < 1$;

and so $a_n = [a'_n]$. 

THEOREM 168. If $[a_0, a_1, a_2, \ldots] = x$, then 
    $a_0 = [x], \quad a_n = [a'_n] \quad (n \ge 0)$.

From this we deduce, as in § 10.5, 

THEOREM 169. Two infinite simple continued fractions which have the 
same value are identical. 
We now return to the continued fraction algorithm of § 10.6. If x 
is irrational the process cannot terminate. Hence it defines an infinite 
sequence of integers 

    $a_0, a_1, a_2, \ldots$,

and as before 

    $x = [a_0, a'_1] = [a_0, a_1, a'_2] = \ldots = [a_0, a_1, a_2, \ldots, a_n, a'_{n+1}]$,

where 
    $a'_{n+1} = a_{n+1} + \dfrac{1}{a'_{n+2}} > a_{n+1}$.

Hence 
    $x = \dfrac{a'_{n+1} p_n + p_{n-1}}{a'_{n+1} q_n + q_{n-1}}$,

by (10.5.1), and so 

    $x - \dfrac{p_n}{q_n} = \dfrac{p_{n-1} q_n - p_n q_{n-1}}{q_n(a'_{n+1} q_n + q_{n-1})} = \dfrac{(-1)^n}{q_n(a'_{n+1} q_n + q_{n-1})}$,

    $\left| x - \dfrac{p_n}{q_n} \right| < \dfrac{1}{q_n(a_{n+1} q_n + q_{n-1})} = \dfrac{1}{q_n q_{n+1}} \le \dfrac{1}{n(n+1)} \to 0$

when $n \to \infty$. Thus 
    $x = \lim_{n \to \infty} \dfrac{p_n}{q_n} = [a_0, a_1, \ldots, a_n, \ldots]$,

and the algorithm leads to the continued fraction whose value is x, and 
which is unique by Theorem 169. 

THEOREM 170. Every irrational number can be expressed in just one 
way as an infinite simple continued fraction. 


Incidentally we see that the value of an infinite simple continued 
fraction is necessarily irrational, since the algorithm would terminate 
if x were rational. 

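We define 
    $q'_n = a'_n q_{n-1} + q_{n-2}$
as in § 10.7. Repeating the argument of that section, we obtain 


THEOREM 171. The results of Theorems 163 and 164 hold also (except 
for the references to N) for infinite continued fractions. In particular 

(10.9.1)    $x - \dfrac{p_n}{q_n} = \dfrac{(-1)^n}{q_n q'_{n+1}}, \quad \left| x - \dfrac{p_n}{q_n} \right| < \dfrac{1}{q_n q_{n+1}} < \dfrac{1}{q_n^2}$.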
10.10. A lemma. We shall need the theorem which follows in 
§ 10.11. 
THEOREM 172. If 
    $x = \dfrac{P\zeta + R}{Q\zeta + S}$,
where $\zeta > 1$ and P, Q, R, and S are integers such that 
    $Q > S > 0, \quad PS - QR = \pm 1$,
then R/S and P/Q are two consecutive convergents to the simple continued 
fraction whose value is x. If R/S is the (n-1)th convergent, and P/Q the 
n-th, then $\zeta$ is the (n+1)th complete quotient. 

We can develop P/Q in a simple continued fraction 
(10.10.1)    $\dfrac{P}{Q} = [a_0, a_1, \ldots, a_n] = \dfrac{p_n}{q_n}$.
After Theorem 158, we may suppose n odd or even as we please. We 
shall choose n so that 
(10.10.2)    $PS - QR = \pm 1 = (-1)^{n-1}$.

Now (P, Q) = 1 and Q > 0, and $p_n$ and $q_n$ satisfy the same conditions. 
Hence (10.10.1) and (10.10.2) imply $P = p_n$, $Q = q_n$, and 

    $p_n S - q_n R = PS - QR = (-1)^{n-1} = p_n q_{n-1} - p_{n-1} q_n$

or 

(10.10.3)    $p_n(S - q_{n-1}) = q_n(R - p_{n-1})$.
Since $(p_n, q_n) = 1$, (10.10.3) implies 
(10.10.4)    $q_n \mid (S - q_{n-1})$.
But 
    $q_n = Q > S > 0, \quad q_n \ge q_{n-1} > 0$,
and so 
    $|S - q_{n-1}| < q_n$,
and this is inconsistent with (10.10.4) unless $S - q_{n-1} = 0$. Hence 
    $S = q_{n-1}, \quad R = p_{n-1}$
and 
    $x = \dfrac{p_n \zeta + p_{n-1}}{q_n \zeta + q_{n-1}}$
or 
    $x = [a_0, a_1, \ldots, a_n, \zeta]$.

If we develop $\zeta$ as a simple continued fraction, we obtain 
    $\zeta = [a_{n+1}, a_{n+2}, \ldots]$
where $a_{n+1} = [\zeta] \ge 1$. Hence 
    $x = [a_0, a_1, \ldots, a_n, a_{n+1}, a_{n+2}, \ldots]$,
a simple continued fraction. But $p_{n-1}/q_{n-1}$ and $p_n/q_n$, that is R/S and 
P/Q, are consecutive convergents of this continued fraction, and $\zeta$ is 
its (n+1)th complete quotient. 


10.11. Equivalent numbers. If $\xi$ and $\eta$ are two numbers such that 

    $\xi = \dfrac{a\eta + b}{c\eta + d}$,

where a, b, c, d are integers such that $ad - bc = \pm 1$, then $\xi$ is said to 
be equivalent to $\eta$. In particular, $\xi$ is equivalent to itself.† 

If $\xi$ is equivalent to $\eta$, then 

    $\eta = \dfrac{-d\xi + b}{c\xi - a}, \quad (-d)(-a) - bc = ad - bc = \pm 1$,

and so $\eta$ is equivalent to $\xi$. Thus the relation of equivalence is 
symmetrical. 

† a = d = 1, b = c = 0. 
THEOREM 173. If $\xi$ and $\eta$ are equivalent, and $\eta$ and $\zeta$ are equivalent, 
then $\xi$ and $\zeta$ are equivalent. 

For 
    $\xi = \dfrac{a\eta + b}{c\eta + d}, \quad ad - bc = \pm 1$,

    $\eta = \dfrac{a'\zeta + b'}{c'\zeta + d'}, \quad a'd' - b'c' = \pm 1$,
and 
    $\xi = \dfrac{A\zeta + B}{C\zeta + D}$,
where 

    $A = aa' + bc', \quad B = ab' + bd', \quad C = ca' + dc', \quad D = cb' + dd'$,
    $AD - BC = (ad - bc)(a'd' - b'c') = \pm 1$.
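The composition rule for the coefficients is that of a product of two-by-two matrices, which is why the determinants multiply. The sketch below is only an illustration; the function names and the sample coefficients are hypothetical choices, not taken from the text.

```python
from fractions import Fraction

def compose(t1, t2):
    """Coefficients (A, B, C, D) of xi in terms of zeta, when xi = t1(eta) and eta = t2(zeta)."""
    a, b, c, d = t1
    a2, b2, c2, d2 = t2
    return (a * a2 + b * c2, a * b2 + b * d2,
            c * a2 + d * c2, c * b2 + d * d2)

def det(t):
    """The quantity ad - bc for the substitution t = (a, b, c, d)."""
    a, b, c, d = t
    return a * d - b * c

def apply(t, z):
    """Value of (a z + b) / (c z + d)."""
    a, b, c, d = t
    return (a * z + b) / (c * z + d)

t1 = (2, 1, 1, 1)                      # ad - bc = 1
t2 = (3, 2, 4, 3)                      # a'd' - b'c' = 1
t12 = compose(t1, t2)                  # (A, B, C, D) = (10, 7, 7, 5)
zeta = Fraction(5, 3)
```

Applying `t2` and then `t1` to a sample value gives the same result as applying the composed substitution once.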

We may also express Theorem 173 by saying that the relation of 
equivalence is transitive. The theorem enables us to arrange irrationals 
in classes of equivalent irrationals. 

If h and k are coprime integers, then, by Theorem 25, there are 
integers h' and k' such that 

    $hk' - h'k = 1$;

and then 

    $\dfrac{h}{k} = \dfrac{h' \cdot 0 + h}{k' \cdot 0 + k} = \dfrac{a \cdot 0 + b}{c \cdot 0 + d}$,

with ad - bc = -1. Hence any rational h/k is equivalent to 0, and 
therefore, by Theorem 173, to any other rational. 


THEOREM 174. Any two rational numbers are equivalent. 


In what follows we confine our attention to irrational numbers, 
represented by infinite continued fractions. 


THEOREM 175. Two irrational numbers $\xi$ and $\eta$ are equivalent if and 
only if 
(10.11.1)    $\xi = [a_0, a_1, \ldots, a_m, c_0, c_1, c_2, \ldots], \quad \eta = [b_0, b_1, \ldots, b_n, c_0, c_1, c_2, \ldots]$,
the sequence of quotients in $\xi$ after the m-th being the same as the sequence 
in $\eta$ after the n-th. 

Suppose first that $\xi$ and $\eta$ are given by (10.11.1) and write 

    $\omega = [c_0, c_1, c_2, \ldots]$.
Then 
    $\xi = [a_0, a_1, \ldots, a_m, \omega] = \dfrac{p_m \omega + p_{m-1}}{q_m \omega + q_{m-1}}$,

and $p_m q_{m-1} - p_{m-1} q_m = \pm 1$, so that $\xi$ and $\omega$ are equivalent. Similarly, 
$\eta$ and $\omega$ are equivalent, and so $\xi$ and $\eta$ are equivalent. The condition is 
therefore sufficient. 
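On the other hand, if $\xi$ and $\eta$ are two equivalent numbers, we have 

    $\eta = \dfrac{a\xi + b}{c\xi + d}, \quad ad - bc = \pm 1$.

We may suppose $c\xi + d > 0$, since otherwise we may replace the 
coefficients by their negatives. When we develop $\xi$ by the continued fraction 
algorithm, we obtain 

    $\xi = [a_0, a_1, \ldots, a_k, a_{k+1}, \ldots]$

    $= [a_0, a_1, \ldots, a_{k-1}, a'_k] = \dfrac{p_{k-1} a'_k + p_{k-2}}{q_{k-1} a'_k + q_{k-2}}$.

Hence 
    $\eta = \dfrac{P a'_k + R}{Q a'_k + S}$,
where 

    $P = a p_{k-1} + b q_{k-1}, \quad R = a p_{k-2} + b q_{k-2}$,
    $Q = c p_{k-1} + d q_{k-1}, \quad S = c p_{k-2} + d q_{k-2}$,
so that P, Q, R, S are integers and 

    $PS - QR = (ad - bc)(p_{k-1} q_{k-2} - p_{k-2} q_{k-1}) = \pm 1$.

By Theorem 171, 

    $p_{k-1} = \xi q_{k-1} + \dfrac{\delta}{q_{k-1}}, \quad p_{k-2} = \xi q_{k-2} + \dfrac{\delta'}{q_{k-2}}$,

where $|\delta| < 1$, $|\delta'| < 1$. Hence 

    $Q = (c\xi + d) q_{k-1} + \dfrac{c\delta}{q_{k-1}}, \quad S = (c\xi + d) q_{k-2} + \dfrac{c\delta'}{q_{k-2}}$.

Now $c\xi + d > 0$, $q_{k-1} > q_{k-2} > 0$, and $q_{k-1}$ and $q_{k-2}$ tend to infinity; 
so that 
    $Q > S > 0$
for sufficiently large k. For such k 

    $\eta = \dfrac{P\zeta + R}{Q\zeta + S}$,

where 
    $PS - QR = \pm 1, \quad Q > S > 0, \quad \zeta = a'_k > 1$;
and so, by Theorem 172, 

    $\eta = [b_0, b_1, \ldots, b_l, \zeta] = [b_0, b_1, \ldots, b_l, a_k, a_{k+1}, \ldots]$

for some $b_0, b_1, \ldots, b_l$. This proves the necessity of the condition. 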


10.12. Periodic continued fractions. A periodic continued fraction 
is an infinite continued fraction in which 

    $a_l = a_{l+k}$

for a fixed positive k and all $l \ge L$. The set of partial quotients 
    $a_L, a_{L+1}, \ldots, a_{L+k-1}$
is called the period, and the continued fraction may be written 
    $[a_0, a_1, \ldots, a_{L-1}, \dot{a}_L, a_{L+1}, \ldots, \dot{a}_{L+k-1}]$.
We shall be concerned only with simple periodic continued fractions. 


THEOREM 176. A periodic continued fraction is a quadratic surd, i.e. 
an irrational root of a quadratic equation with integral coefficients. 


If $a'_L$ is the Lth complete quotient of the periodic continued fraction 
x, we have 

    $a'_L = [a_L, a_{L+1}, \ldots, a_{L+k-1}, a_L, a_{L+1}, \ldots]$

    $= [a_L, a_{L+1}, \ldots, a_{L+k-1}, a'_L]$,

    $a'_L = \dfrac{p' a'_L + p''}{q' a'_L + q''}$,

(10.12.1)    $q' a_L'^2 + (q'' - p') a'_L - p'' = 0$,

where p''/q'' and p'/q' are the last two convergents to $[a_L, a_{L+1}, \ldots, a_{L+k-1}]$. 
But 
    $x = \dfrac{p_{L-1} a'_L + p_{L-2}}{q_{L-1} a'_L + q_{L-2}}, \quad a'_L = \dfrac{p_{L-2} - q_{L-2} x}{q_{L-1} x - p_{L-1}}$.

If we substitute for $a'_L$ in (10.12.1), and clear of fractions, we obtain an 
equation 
(10.12.2)    $ax^2 + bx + c = 0$
with integral coefficients. Since x is irrational, $b^2 - 4ac \ne 0$. 

The converse of the theorem is also true, but its proof is a little more 
difficult. 

THEOREM 177. The continued fraction which represents a quadratic 
surd is periodic. 

A quadratic surd satisfies a quadratic equation with integral 
coefficients, which we may write in the form (10.12.2). If 

    $x = [a_0, a_1, \ldots, a_n, \ldots]$,

then 
    $x = \dfrac{p_{n-1} a'_n + p_{n-2}}{q_{n-1} a'_n + q_{n-2}}$;

and if we substitute this in (10.12.2) we obtain 
(10.12.3)    $A_n a_n'^2 + B_n a'_n + C_n = 0$,

where 

    $A_n = a p_{n-1}^2 + b p_{n-1} q_{n-1} + c q_{n-1}^2$,
    $B_n = 2a p_{n-1} p_{n-2} + b(p_{n-1} q_{n-2} + p_{n-2} q_{n-1}) + 2c q_{n-1} q_{n-2}$,
    $C_n = a p_{n-2}^2 + b p_{n-2} q_{n-2} + c q_{n-2}^2$.
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If A, = Opn t+ OP Ina+CIn—1 = 0, 
then (10.12.2) has the rational root p,_1/¢,-1, and this is impossible 
because x is irrational. Hence A, Æ 0 and 


A,y+B,y+C = 0 
is an equation one of whose roots is ap. A little calculation shows that 
(10.12.4) Bi—4A, Ch = (b?—4ac)(p, 1 Gn 2 Pn 2 In D? 


= b?—dac. 


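By Theorem 171,
    $p_{n-1} = x q_{n-1} + \frac{\delta_{n-1}}{q_{n-1}} \qquad (|\delta_{n-1}| < 1).$
Hence
    $A_n = a\left(x q_{n-1} + \frac{\delta_{n-1}}{q_{n-1}}\right)^2 + b q_{n-1}\left(x q_{n-1} + \frac{\delta_{n-1}}{q_{n-1}}\right) + c q_{n-1}^2$
    $\phantom{A_n} = (ax^2 + bx + c) q_{n-1}^2 + 2ax\delta_{n-1} + a\frac{\delta_{n-1}^2}{q_{n-1}^2} + b\delta_{n-1}$
    $\phantom{A_n} = 2ax\delta_{n-1} + a\frac{\delta_{n-1}^2}{q_{n-1}^2} + b\delta_{n-1},$
and
    $|A_n| < 2|ax| + |a| + |b|.$
Next, since $C_n = A_{n-1}$,
    $|C_n| < 2|ax| + |a| + |b|.$
Finally, by (10.12.4),
    $B_n^2 \le 4|A_n C_n| + |b^2 - 4ac|$
    $\phantom{B_n^2} < 4(2|ax| + |a| + |b|)^2 + |b^2 - 4ac|.$
Hence the absolute values of $A_n$, $B_n$, and $C_n$ are less than numbers
independent of $n$.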

It follows that there are only a finite number of different triplets
$(A_n, B_n, C_n)$; and we can find a triplet $(A, B, C)$ which occurs at least
three times, say as $(A_{n_1}, B_{n_1}, C_{n_1})$, $(A_{n_2}, B_{n_2}, C_{n_2})$, and $(A_{n_3}, B_{n_3}, C_{n_3})$.
Hence $a'_{n_1}$, $a'_{n_2}$, $a'_{n_3}$ are all roots of
    $Ay^2 + By + C = 0,$
and at least two of them must be equal. But if, for example, $a'_{n_1} = a'_{n_2}$,
then
    $a_{n_2} = a_{n_1}, \quad a_{n_2+1} = a_{n_1+1}, \quad \ldots,$


and the continued fraction is periodic. 
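
The periodicity asserted by Theorem 177 can be watched numerically. What follows is a minimal Python sketch, not part of the original text, and restricted to the special surd $\sqrt{D}$. The function name `sqrt_continued_fraction` and the integer pair $(m, d)$, which encodes the complete quotient $(\sqrt{D} + m)/d$, are assumptions of the sketch. It stops as soon as a complete quotient recurs, which plays the part of the repeated triplet $(A, B, C)$ in the proof.

```python
from math import isqrt

def sqrt_continued_fraction(D):
    """Return (preperiod, period) of the simple continued fraction of sqrt(D).

    The complete quotient at each step is (sqrt(D) + m) / d with integers m, d.
    Only finitely many pairs (m, d) can occur, so one of them must repeat.
    """
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, d, a = 0, 1, a0
    seen = {}            # (m, d) -> index of the quotient produced from it
    quotients = []
    while (m, d) not in seen:
        seen[(m, d)] = len(quotients)
        quotients.append(a)
        m = d * a - m
        d = (D - m * m) // d      # the division is exact
        a = (a0 + m) // d
    start = seen[(m, d)]
    return quotients[:start], quotients[start:]

print(sqrt_continued_fraction(7))   # ([2], [1, 1, 1, 4])
```

The design choice here is to detect the recurrence from the state $(m, d)$ rather than from the partial quotients alone, since equal partial quotients do not by themselves imply equal complete quotients.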
10.13. Some special quadratic surds. It is easy to find the
continued fraction for a special surd such as $\sqrt 2$ or $\sqrt 3$ by carrying out
the algorithm of § 10.6 until it recurs. Thus
(10.13.1)  $\sqrt 2 = 1 + (\sqrt 2 - 1) = 1 + \frac{1}{\sqrt 2 + 1} = 1 + \frac{1}{2 + (\sqrt 2 - 1)}$
           $= 1 + \frac{1}{2+}\,\frac{1}{\sqrt 2 + 1} = 1 + \frac{1}{2+}\,\frac{1}{2+}\cdots = [1, \dot 2],$
and, similarly,
(10.13.2)  $\sqrt 3 = 1 + \frac{1}{1+}\,\frac{1}{2+}\,\frac{1}{1+}\,\frac{1}{2+}\cdots = [1, \dot 1, \dot 2],$
(10.13.3)  $\sqrt 5 = 2 + \frac{1}{4+}\,\frac{1}{4+}\cdots = [2, \dot 4],$
(10.13.4)  $\sqrt 7 = 2 + \frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{4+}\cdots = [2, \dot 1, 1, 1, \dot 4].$
But the most interesting special continued fractions are not usually 
‘pure’ surds. 
A particular simple type is 
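(10.13.5)  $x = b + \frac{1}{a+}\,\frac{1}{b+}\,\frac{1}{a+}\,\frac{1}{b+}\cdots = [\dot b, \dot a],$
where $a \mid b$, so that $b = ac$, where $c$ is an integer. In this case
    $x = b + \frac{1}{a+}\,\frac{1}{x} = \frac{(ab+1)x + b}{ax + 1},$
(10.13.6)  $x^2 - bx - c = 0,$
(10.13.7)  $x = \tfrac12\{b + \sqrt{b^2 + 4c}\}.$
In particular
(10.13.8)  $\alpha = 1 + \frac{1}{1+}\,\frac{1}{1+}\cdots = [\dot 1] = \frac{\sqrt 5 + 1}{2},$
(10.13.9)  $\beta = 2 + \frac{1}{2+}\,\frac{1}{2+}\cdots = [\dot 2] = \sqrt 2 + 1,$
(10.13.10) $\gamma = 2 + \frac{1}{1+}\,\frac{1}{2+}\,\frac{1}{1+}\cdots = [\dot 2, \dot 1] = \sqrt 3 + 1.$

It will be observed that $\beta$ and $\gamma$ are equivalent, in the sense of § 10.11,
to $\sqrt 2$ and $\sqrt 3$ respectively, but that $\alpha$ is not equivalent to $\sqrt 5$.
It is easy to find a general formula for the convergents to (10.13.5).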
THEOREM 178. The $(n+1)$th convergent to (10.13.5) is given by
(10.13.11)  $p_n = c^{-[\frac12(n+1)]} u_{n+2}, \quad q_n = c^{-[\frac12(n+1)]} u_{n+1},$†
where
(10.13.12)  $u_n = \frac{x^n - y^n}{x - y}$
and $x$ and $y$ are the roots of (10.13.6).
In the first place
    $q_0 = 1 = u_1, \quad q_1 = a = \frac{b}{c} = \frac{x+y}{c} = \frac{u_2}{c},$
    $p_0 = b = x + y = u_2, \quad p_1 = ab + 1 = \frac{b^2 + c}{c} = \frac{(x+y)^2 - xy}{c} = \frac{u_3}{c},$
so that the formulae (10.13.11) are true for $n = 0$ and $n = 1$. We prove
the general formulae by induction.
We have to prove that
    $p_n = c^{-[\frac12(n+1)]} u_{n+2} = w_{n+2},$
say. Now $x^{n+2} = bx^{n+1} + cx^n$, $y^{n+2} = by^{n+1} + cy^n$,
and so
(10.13.13)  $u_{n+2} = b u_{n+1} + c u_n.$
But $u_{2m+2} = c^m w_{2m+2}$, $u_{2m+1} = c^m w_{2m+1}$.
Substituting into (10.13.13), and distinguishing the cases of even and
odd $n$, we find that
    $w_{2m+2} = b w_{2m+1} + w_{2m}, \quad w_{2m+1} = a w_{2m} + w_{2m-1}.$
Hence $w_{n+2}$ satisfies the same recurrence formulae as $p_n$, and so
$p_n = w_{n+2}$. Similarly we prove that $q_n = w_{n+1}$.
The argument is naturally a little simpler when $a = b$, $c = 1$. In
this case $p_n$ and $q_n$ satisfy
    $u_{n+2} = b u_{n+1} + u_n$
and are of the form $Ax^n + By^n$,
where $A$ and $B$ are independent of $n$ and may be determined from the
values of the first two convergents. We thus find that
    $p_n = \frac{x^{n+2} - y^{n+2}}{x - y}, \quad q_n = \frac{x^{n+1} - y^{n+1}}{x - y},$
in agreement with Theorem 178.


† The power of $c$ is $c^{-m}$ when $n = 2m$ and $c^{-m-1}$ when $n = 2m+1$.
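
Theorem 178 lends itself to a direct numerical check. The sketch below is an illustration added here, not the book's method: it builds the convergents of $[b, a, b, a, \ldots]$ from the usual recurrence, builds $u_n$ from (10.13.13) with $u_0 = 0$, $u_1 = 1$, and compares the two after clearing the power of $c$. The helper names (`convergents_ba`, `u_sequence`, `check_theorem_178`) are hypothetical.

```python
def convergents_ba(a, b, count):
    """Convergents (p_n, q_n) of [b, a, b, a, ...] by the usual recurrence."""
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = b, 1                    # p_0, q_0
    out = [(p, q)]
    for n in range(1, count):
        quotient = a if n % 2 else b   # partial quotients alternate b, a, b, a, ...
        p, p_prev = quotient * p + p_prev, p
        q, q_prev = quotient * q + q_prev, q
        out.append((p, q))
    return out

def u_sequence(b, c, count):
    """u_0, u_1, ... from u_{n+2} = b*u_{n+1} + c*u_n with u_0 = 0, u_1 = 1."""
    u = [0, 1]
    while len(u) < count:
        u.append(b * u[-1] + c * u[-2])
    return u

def check_theorem_178(a, c, count=10):
    """True if c^[(n+1)/2] * (p_n, q_n) == (u_{n+2}, u_{n+1}) for n < count."""
    b = a * c
    u = u_sequence(b, c, count + 2)
    for n, (p, q) in enumerate(convergents_ba(a, b, count)):
        power = c ** ((n + 1) // 2)
        if (p * power, q * power) != (u[n + 2], u[n + 1]):
            return False
    return True

print(convergents_ba(2, 6, 3))   # [(6, 1), (13, 2), (84, 13)]
```

Multiplying by $c^{[\frac12(n+1)]}$ instead of dividing keeps the comparison in integers.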




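10.14. The series of Fibonacci and Lucas. In the special case
$a = b = 1$ we have
(10.14.1)  $x = \frac{\sqrt 5 + 1}{2}, \quad y = -\frac{1}{x} = -\frac{\sqrt 5 - 1}{2},$
    $p_n = u_{n+2} = \frac{x^{n+2} - y^{n+2}}{\sqrt 5}, \quad q_n = u_{n+1} = \frac{x^{n+1} - y^{n+1}}{\sqrt 5}.$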


The series $(u_n)$ or

(10.14.2)  1, 1, 2, 3, 5, 8, 13, 21, ...

in which the first two terms are $u_1$ and $u_2$ and each term after is the
sum of the two preceding, is usually called Fibonacci's series. There
are, of course, similar series with other initial terms, the most interesting
being the series $(v_n)$ or

(10.14.3)  1, 3, 4, 7, 11, 18, 29, 47, ...
defined by
(10.14.4)  $v_n = x^n + y^n.$


Such series have been studied in great detail by Lucas and later writers, 
in particular D. H. Lehmer, and have very interesting arithmetical 
properties. We shall come across the series (10.14.3) again in Ch. XV
in connexion with the Mersenne numbers.

We note here some arithmetical properties of these series, and parti- 
cularly of (10.14.2). 

THEOREM 179. The numbers $u_n$ and $v_n$ defined by (10.14.2) and (10.14.3)
have the following properties:

(i) $(u_n, u_{n+1}) = 1$, $(v_n, v_{n+1}) = 1$;

(ii) $u_n$ and $v_n$ are both odd or both even, and
    $(u_n, v_n) = 1, \quad (u_n, v_n) = 2$
in these two cases;
(iii) $u_n \mid u_{rn}$ for every $r$;
(iv) if $(m, n) = d$ then
    $(u_m, u_n) = u_d,$
and, in particular, $u_m$ and $u_n$ are coprime if $m$ and $n$ are coprime;
(v) if $(m, n) = 1$, then
    $u_m u_n \mid u_{mn}.$

It is convenient to regard (10.13.12) and (10.14.4) as defining $u_n$ and
$v_n$ for all integral $n$. Then
    $u_0 = 0, \quad v_0 = 2$
and
(10.14.5)  $u_{-n} = -(xy)^{-n} u_n = (-1)^{n-1} u_n, \quad v_{-n} = (-1)^n v_n.$
We can verify at once that
(10.14.6)  $2u_{m+n} = u_m v_n + u_n v_m,$
(10.14.7)  $v_n^2 - 5u_n^2 = (-1)^n 4,$
(10.14.8)  $u_n^2 - u_{n-1} u_{n+1} = (-1)^{n-1},$
(10.14.9)  $v_n^2 - v_{n-1} v_{n+1} = (-1)^n 5.$


Proceeding to the proof of the theorem, we observe first that (i)
follows from the recurrence formulae, or from (10.14.8), (10.14.9), and
(10.14.7), and (ii) from (10.14.7).

Next, suppose (iii) true for $r = 1, 2, \ldots, R-1$. By (10.14.6),
    $2u_{Rn} = u_n v_{(R-1)n} + u_{(R-1)n} v_n.$
If $u_n$ is odd, then $u_n \mid 2u_{Rn}$ and so $u_n \mid u_{Rn}$. If $u_n$ is even, then $v_n$ is even
by (ii), $u_{(R-1)n}$ by hypothesis, and $v_{(R-1)n}$ by (ii). Hence we may write
    $u_{Rn} = u_n \cdot \tfrac12 v_{(R-1)n} + u_{(R-1)n} \cdot \tfrac12 v_n,$
and again $u_n \mid u_{Rn}$.

This proves (iii) for all positive $r$. The formulae (10.14.5) then show
that it is also true for negative $r$.

To prove (iv) we observe that, if $(m, n) = d$, there are integers $r$, $s$
(positive or negative) for which
    $rm + sn = d,$
and that
(10.14.10)  $2u_d = u_{rm} v_{sn} + u_{sn} v_{rm}$
by (10.14.6). Hence, if $(u_m, u_n) = h$, we have
    $h \mid u_m \,.\, h \mid u_n \;\to\; h \mid u_{rm} \,.\, h \mid u_{sn} \;\to\; h \mid 2u_d.$
If $h$ is odd, $h \mid u_d$. If $h$ is even, then $u_m$ and $u_n$ are even, and so $u_{rm}$,
$u_{sn}$, $v_{rm}$, $v_{sn}$ are all even, by (ii) and (iii). We may therefore write
(10.14.10) as
    $u_d = u_{rm}(\tfrac12 v_{sn}) + u_{sn}(\tfrac12 v_{rm}),$
and it follows as before that $h \mid u_d$. Thus $h \mid u_d$ in any case. Also $u_d \mid u_m$,
$u_d \mid u_n$, by (iii), and so $u_d \mid (u_m, u_n) = h$.
Hence $h = u_d$,
which is (iv).
Finally, if $(m, n) = 1$, we have
    $u_m \mid u_{mn}, \quad u_n \mid u_{mn}$
by (iii), and $(u_m, u_n) = 1$ by (iv). Hence
    $u_m u_n \mid u_{mn}.$
In particular it follows from (iii) that $u_m$ can be prime only when $m$
is 4 (when $u_4 = 3$) or an odd prime $p$. But $u_p$ is not necessarily prime:
    $u_{53} = 53316291173 = 953 \cdot 55945741.$
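
The identities (10.14.6) to (10.14.9), part (iv) of Theorem 179, and the factorization of $u_{53}$ are easy to confirm by machine for small indices. The following Python sketch is an added illustration; the names `fib_lucas` and `check_identities` are hypothetical, and the lists are indexed so that `u[n]` is $u_n$ and `v[n]` is $v_n$, with $u_0 = 0$, $v_0 = 2$.

```python
from math import gcd

def fib_lucas(count):
    """Lists u_0..u_{count-1} (Fibonacci) and v_0..v_{count-1} (Lucas)."""
    u, v = [0, 1], [2, 1]
    while len(u) < count:
        u.append(u[-1] + u[-2])
        v.append(v[-1] + v[-2])
    return u, v

u, v = fib_lucas(60)

def check_identities(limit=25):
    """Assert the identities for 1 <= m, n < limit; return True if all hold."""
    for n in range(1, limit):
        assert v[n] ** 2 - 5 * u[n] ** 2 == 4 * (-1) ** n          # (10.14.7)
        assert u[n] ** 2 - u[n - 1] * u[n + 1] == (-1) ** (n - 1)  # (10.14.8)
        assert v[n] ** 2 - v[n - 1] * v[n + 1] == 5 * (-1) ** n    # (10.14.9)
        for m in range(1, limit):
            assert 2 * u[m + n] == u[m] * v[n] + u[n] * v[m]        # (10.14.6)
            assert gcd(u[m], u[n]) == u[gcd(m, n)]                  # Theorem 179 (iv)
    return True

print(check_identities())                 # True
print(u[53] == 953 * 55945741)            # True
```

A finite check of this kind proves nothing, of course; it only guards against a misprinted sign or index in the formulae.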

THEOREM 180. Every prime $p$ divides some Fibonacci number (and
therefore an infinity of the numbers). In particular
    $u_{p-1} \equiv 0 \pmod p$
if $p = 5m \pm 1$, and
    $u_{p+1} \equiv 0 \pmod p$
if $p = 5m \pm 2$.

Since $u_3 = 2$ and $u_5 = 5$, we may suppose that $p \neq 2$, $p \neq 5$. It
follows from (10.13.12) and (10.14.1) that
(10.14.11)  $2^{n-1} u_n = n + \binom{n}{3} 5 + \binom{n}{5} 5^2 + \cdots,$
where the last term is $5^{\frac12(n-1)}$ if $n$ is odd and $n \cdot 5^{\frac12 n - 1}$ if $n$ is even. If $n = p$
then
    $2^{p-1} \equiv 1, \quad 5^{\frac12(p-1)} \equiv \left(\frac{5}{p}\right) \pmod p,$
by Theorems 71 and 83; and the binomial coefficients are all divisible
by $p$, except the last which is 1. Hence
    $u_p \equiv \left(\frac{5}{p}\right) = \pm 1 \pmod p$
and therefore, by (10.14.8),
    $u_{p-1} u_{p+1} \equiv 0 \pmod p.$
Also $(p-1, p+1) = 2$, and so
    $(u_{p-1}, u_{p+1}) = u_2 = 1,$
by Theorem 179 (iv). Hence one and only one of $u_{p-1}$ and $u_{p+1}$ is
divisible by $p$.
To distinguish the two cases, take $n = p+1$ in (10.14.11). Then
    $2^p u_{p+1} = (p+1) + \binom{p+1}{3} 5 + \cdots + (p+1) 5^{\frac12(p-1)}.$
Here all but the first and last coefficients are divisible by $p$,† and so
    $2^p u_{p+1} \equiv 1 + \left(\frac{5}{p}\right) \pmod p.$
Hence $u_{p+1} \equiv 0 \pmod p$ if $\left(\frac{5}{p}\right) = -1$, i.e. if $p \equiv \pm 2 \pmod 5$,‡ and
$u_{p-1} \equiv 0 \pmod p$ in the contrary case.


We shall give another proof of Theorem 180 in § 15.4. 
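
The two congruences of Theorem 180 can be tested for small primes. The sketch below is an added illustration (the function names are hypothetical); it reduces the Fibonacci recurrence modulo $p$ and checks $u_{p-1}$ or $u_{p+1}$ according to the residue of $p$ modulo 5. The prime 5 itself is skipped, since it falls under neither case.

```python
def fib_mod(n, p):
    """u_n modulo p, by iterating u_{k+1} = u_k + u_{k-1} from u_0 = 0, u_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, (a + b) % p
    return a

def check_theorem_180(primes):
    """Assert u_{p-1} = 0 (mod p) for p = 5m +- 1 and u_{p+1} = 0 (mod p) for p = 5m +- 2."""
    for p in primes:
        if p % 5 in (1, 4):
            assert fib_mod(p - 1, p) == 0
        elif p % 5 in (2, 3):
            assert fib_mod(p + 1, p) == 0
    return True

print(check_theorem_180([2, 3, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]))  # True
```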


† $\binom{p+1}{\nu}$, where $3 \le \nu \le p-1$, is an integer, by Theorem 73; the numerator contains
$p$, and the denominator does not. ‡ By Theorem 97.

10.15. Approximation by convergents. We conclude this chapter
by proving some theorems whose importance will become clearer in
Ch. XI.

By Theorem 171,
    $\left| \frac{p_n}{q_n} - x \right| < \frac{1}{q_n^2},$
so that $p_n/q_n$ provides a good approximation to $x$. The theorem which
follows shows that $p_n/q_n$ is the fraction, among all fractions of no greater
complexity, i.e. all fractions whose denominator does not exceed $q_n$,
which provides the best approximation.

THEOREM 181. If $n > 1$,† $0 < q \le q_n$, and $p/q \neq p_n/q_n$, then
(10.15.1)  $\left| \frac{p_n}{q_n} - x \right| < \left| \frac{p}{q} - x \right|.$

This is included in a stronger theorem, viz.

THEOREM 182. If $n > 1$, $0 < q \le q_n$, and $p/q \neq p_n/q_n$, then
(10.15.2)  $|p_n - q_n x| < |p - qx|.$

We may suppose that $(p, q) = 1$. Also, by Theorem 171,
    $|p_n - q_n x| < |p_{n-1} - q_{n-1} x|,$
and it is sufficient to prove the theorem on the assumption that
$q_{n-1} < q \le q_n$, the complete theorem then following by induction.

Suppose first that $q = q_n$. Then
    $\left| \frac{p_n}{q_n} - \frac{p}{q_n} \right| \ge \frac{1}{q_n}$
if $p \neq p_n$. But
    $\left| \frac{p_n}{q_n} - x \right| \le \frac{1}{q_n q_{n+1}} < \frac{1}{2q_n}$
by Theorems 171 and 156; and therefore
    $\left| \frac{p_n}{q_n} - x \right| < \left| \frac{p}{q_n} - x \right|,$
which is (10.15.2).
Next suppose that $q_{n-1} < q < q_n$, so that $p/q$ is not equal to either
of $p_{n-1}/q_{n-1}$ or $p_n/q_n$. If we write
    $\mu p_n + \nu p_{n-1} = p, \quad \mu q_n + \nu q_{n-1} = q,$
then
    $\mu(p_n q_{n-1} - p_{n-1} q_n) = p q_{n-1} - q p_{n-1},$
so that
    $\mu = \pm(p q_{n-1} - q p_{n-1});$
and similarly
    $\nu = \pm(p q_n - q p_n).$
Hence $\mu$ and $\nu$ are integers and neither is zero.
Since $q = \mu q_n + \nu q_{n-1} < q_n$, $\mu$ and $\nu$ must have opposite signs. By
Theorem 171,
    $p_n - q_n x, \quad p_{n-1} - q_{n-1} x$
have opposite signs. Hence
    $\mu(p_n - q_n x), \quad \nu(p_{n-1} - q_{n-1} x)$
have the same sign. But
    $p - qx = \mu(p_n - q_n x) + \nu(p_{n-1} - q_{n-1} x),$
and therefore
    $|p - qx| > |p_{n-1} - q_{n-1} x| > |p_n - q_n x|.$


† We state Theorems 181 and 182 for $n > 1$ in order to avoid a trivial complication.
The proof is valid for $n = 1$ unless $q_2 = q_{n+1} = 2$, which is possible only if $a_1 = a_2 = 1$.
In this case
    $x = a_0 + \frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{a_3+}\cdots, \quad \frac{p_1}{q_1} = a_0 + 1,$
and
    $a_0 + \tfrac12 < x < a_0 + 1$
unless the fraction ends at the second 1. If this is not so then $p_1/q_1$ is nearer to $x$ than any
other integer. But in the exceptional case $x = a_0 + \tfrac12$ there are two integers equidistant
from $x$, and (10.15.1) may become an equality.
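
Theorem 182 can be illustrated by brute force. In the sketch below, which is an addition and not the book's argument, $x$ is taken to be the exact rational value of the floating-point number `math.pi` (an assumption made so that all arithmetic is exact), its first convergents are computed by the continued fraction algorithm, and every $q \le q_n$ is tried. The names `convergents` and `is_best_approximation` are hypothetical.

```python
from fractions import Fraction
import math

def convergents(x, count):
    """First `count` convergents (p_n, q_n) of the rational number x."""
    result = []
    p_prev, q_prev = 1, 0      # p_{-1}, q_{-1}
    p_prev2, q_prev2 = 0, 1    # p_{-2}, q_{-2}
    for _ in range(count):
        a = math.floor(x)
        p, q = a * p_prev + p_prev2, a * q_prev + q_prev2
        result.append((p, q))
        p_prev2, q_prev2, p_prev, q_prev = p_prev, q_prev, p, q
        frac = x - a
        if frac == 0:          # x was rational and its expansion has ended
            break
        x = 1 / frac
    return result

def is_best_approximation(x, p_n, q_n):
    """True if |p_n - q_n x| < |p - q x| for all p/q != p_n/q_n with 0 < q <= q_n."""
    best = abs(p_n - q_n * x)
    for q in range(1, q_n + 1):
        # Only the two integers nearest to q*x can compete.
        for p in (math.floor(q * x), math.floor(q * x) + 1):
            if Fraction(p, q) != Fraction(p_n, q_n) and abs(p - q * x) <= best:
                return False
    return True

x = Fraction(math.pi)
print(convergents(x, 4))   # [(3, 1), (22, 7), (333, 106), (355, 113)]
```

For example, `is_best_approximation(x, 355, 113)` confirms that no fraction with denominator at most 113 makes $|p - qx|$ as small as $355/113$ does.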
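Our next theorem gives a refinement on the inequality (10.9.1) of
Theorem 171.

THEOREM 183. Of any two consecutive convergents to $x$, one at least
satisfies the inequality
(10.15.3)  $\left| \frac{p}{q} - x \right| < \frac{1}{2q^2}.$
Since the convergents are alternately less and greater than $x$, we have
(10.15.4)  $\left| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \right| = \left| \frac{p_n}{q_n} - x \right| + \left| \frac{p_{n+1}}{q_{n+1}} - x \right|.$
If (10.15.3) were untrue for both $p_n/q_n$ and $p_{n+1}/q_{n+1}$, then (10.15.4)
would imply
    $\frac{1}{q_n q_{n+1}} = \left| \frac{p_{n+1} q_n - p_n q_{n+1}}{q_n q_{n+1}} \right| = \left| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \right| \ge \frac{1}{2q_n^2} + \frac{1}{2q_{n+1}^2},$
or
    $(q_{n+1} - q_n)^2 \le 0,$
which is false except in the special case
    $n = 0, \quad a_1 = 1, \quad q_1 = q_0 = 1.$
In this case
    $0 < \frac{p_1}{q_1} - x = 1 - \frac{1}{1+}\,\frac{1}{a_2+}\cdots < 1 - \frac{a_2}{a_2 + 1} \le \tfrac12,$
so that the theorem is still true.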
It follows that, when $x$ is irrational, there are an infinity of con-
vergents $p_n/q_n$ which satisfy (10.15.3). Our last theorem in this chapter
shows that this inequality is characteristic of convergents.

THEOREM 184. If
(10.15.5)  $\left| \frac{p}{q} - x \right| < \frac{1}{2q^2},$
then $p/q$ is a convergent.
If (10.15.5) is true, then
    $\frac{p}{q} - x = \frac{\epsilon\theta}{q^2},$
where $\epsilon = \pm 1$, $0 < \theta < \tfrac12$.
We can express $p/q$ as a finite continued fraction
    $[a_0, a_1, \ldots, a_n];$
and since, by Theorem 158, we can make $n$ odd or even at our discretion,
we may suppose that $\epsilon = (-1)^{n-1}$.
We write
    $x = \frac{\omega p_n + p_{n-1}}{\omega q_n + q_{n-1}},$
where $p_n/q_n$, $p_{n-1}/q_{n-1}$ are the last and the last but one convergents to
the continued fraction for $p/q$. Then
    $\frac{\epsilon\theta}{q_n^2} = \frac{p_n}{q_n} - x = \frac{p_n q_{n-1} - p_{n-1} q_n}{q_n(\omega q_n + q_{n-1})} = \frac{(-1)^{n-1}}{q_n(\omega q_n + q_{n-1})},$
and so
    $\frac{q_n}{\omega q_n + q_{n-1}} = \theta.$
Hence
    $\omega = \frac{1}{\theta} - \frac{q_{n-1}}{q_n} > 1$
(since $0 < \theta < \tfrac12$); and so, by Theorem 172, $p_{n-1}/q_{n-1}$ and $p_n/q_n$ are
consecutive convergents to $x$. But $p_n/q_n = p/q$.
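
Theorem 184 says that the inequality (10.15.5) picks out convergents and nothing else. As an added illustration, the sketch below searches all reduced fractions with $q \le 200$ that satisfy (10.15.5) for $x \approx \sqrt 2$ and compares them with the convergents of $[1, \dot 2]$. Two assumptions are made: $x$ is the exact rational value of the float `sqrt(2)`, so that the comparison is exact, and the function names are hypothetical. For $\sqrt 2$ every convergent happens to satisfy (10.15.5), so the two lists coincide.

```python
from fractions import Fraction
from math import gcd, sqrt

x = Fraction(sqrt(2))      # exact rational value of the float, very close to sqrt(2)

def close_fractions(x, q_max):
    """All reduced p/q with 0 < q <= q_max and |p/q - x| < 1/(2 q^2)."""
    found = []
    for q in range(1, q_max + 1):
        for p in (int(q * x), int(q * x) + 1):   # only the nearest p can qualify
            if gcd(p, q) == 1 and abs(Fraction(p, q) - x) < Fraction(1, 2 * q * q):
                found.append((p, q))
    return found

def sqrt2_convergents(q_max):
    """Convergents of [1; 2, 2, 2, ...] with denominator <= q_max."""
    out = []
    p_prev, q_prev, p, q = 1, 0, 1, 1
    while q <= q_max:
        out.append((p, q))
        p, p_prev = 2 * p + p_prev, p
        q, q_prev = 2 * q + q_prev, q
    return out

print(close_fractions(x, 200) == sqrt2_convergents(200))   # True
```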


NOTES ON CHAPTER X


§ 10.1. The best and most complete account of the theory of continued frac-
tions is that in Perron's Kettenbrüche; and many proofs in this and the next chapter
are modelled on those given in this book or in the same writer's Irrationalzahlen.
The only extended treatment of the subject in English is in Chrystal's Algebra, ii.
Perron gives full references to the history of the subject.

§ 10.12. Theorem 177 is Lagrange's most famous contribution to the theory.
The proof given here (Perron, Kettenbrüche, 77) is due to Charves.

§§ 10.13-14. There is a large literature concerned with Fibonacci's and similar
series. See Bachmann, Niedere Zahlentheorie, ii, ch. ii; Dickson, History, i, ch. xvii;
D. H. Lehmer, Annals of Math. (2), 31 (1930), 419-48.


XI 
APPROXIMATION OF IRRATIONALS BY RATIONALS 


11.1. Statement of the problem. The problem considered in this
chapter is that of the approximation of a given number $\xi$, usually
irrational, by a rational fraction
    $r = \frac{p}{q}.$
We suppose throughout that $0 < \xi < 1$ and that $p/q$ is irreducible.†
Since the rationals are dense in the continuum, there are rationals as
near as we please to any $\xi$. Given $\xi$ and any positive number $\epsilon$, there is
an $r = p/q$ such that
    $|r - \xi| = \left| \frac{p}{q} - \xi \right| < \epsilon;$
any number can be approximated by a rational with any assigned degree
of accuracy. We ask now how simply or, what is essentially the same
thing, how rapidly can we approximate to $\xi$? Given $\xi$ and $\epsilon$, how com-
plex must $p/q$ be (i.e. how large $q$) to secure an approximation with the
measure of accuracy $\epsilon$? Given $\xi$ and $q$, or some upper bound for $q$, how
small can we make $\epsilon$?

We have already done something to answer these questions. We
proved, for example, in Ch. III (Theorem 36) that, given $\xi$ and $n$,
    $\exists\, p, q \,.\, 0 < q \le n \,.\, \left| \frac{p}{q} - \xi \right| \le \frac{1}{q(n+1)},$
and a fortiori
(11.1.1)  $\left| \frac{p}{q} - \xi \right| < \frac{1}{q^2};$
and in Ch. X we proved a number of similar theorems by the use of
continued fractions.‡ The inequality (11.1.1), or stronger inequalities
of the same type, will recur continually throughout this chapter.

When we consider (11.1.1) more closely, we find at once that we must
distinguish two cases.

(1) $\xi$ is a rational $a/b$. If $r \neq \xi$, then
(11.1.2)  $|r - \xi| = \left| \frac{p}{q} - \frac{a}{b} \right| = \frac{|bp - aq|}{bq} \ge \frac{1}{bq},$
so that (11.1.1) involves $q < b$. There are therefore only a finite number
of solutions of (11.1.1).

(2) $\xi$ is irrational. Then there are an infinity of solutions of (11.1.1).
For, if $p_n/q_n$ is any one of the convergents to the continued fraction
to $\xi$, then, by Theorem 171,
    $\left| \frac{p_n}{q_n} - \xi \right| < \frac{1}{q_n^2},$
and $p_n/q_n$ is a solution.

THEOREM 185. If $\xi$ is irrational, then there is an infinity of fractions
$p/q$ which satisfy (11.1.1).


† Except in § 11.12. ‡ See Theorems 171 and 183.

In § 11.3 we shall give an alternative proof, independent of the theory 
of continued fractions. 


11.2. Generalities concerning the problem. We can regard our
problem from two different points of view. We suppose $\xi$ irrational.

(1) We may think first of $\epsilon$. Given $\xi$, for what functions
    $\Phi = \Phi\left(\xi, \frac{1}{\epsilon}\right)$
is it true that
(11.2.1)  $\exists\, p, q \,.\, q \le \Phi \,.\, \left| \frac{p}{q} - \xi \right| \le \epsilon,$
for the given $\xi$ and every positive $\epsilon$? Or for what functions
    $\Phi = \Phi\left(\frac{1}{\epsilon}\right),$
independent of $\xi$, is (11.2.1) true for every $\xi$ and every positive $\epsilon$? It
is plain that any $\Phi$ with these properties must tend to infinity when
$\epsilon$ tends to zero, but the more slowly it does so the better.

There are certainly some functions $\Phi$ which have the properties
required. Thus we may take
    $\Phi = \left[ \frac{1}{2\epsilon} \right] + 1,$
and $q = \Phi$. There is then a $p$ for which
    $\left| \frac{p}{q} - \xi \right| \le \frac{1}{2q} < \epsilon,$
and so this $\Phi$ satisfies our requirements. The problem remains of find-
ing, if possible, more advantageous forms of $\Phi$.

(2) We may think first of $q$. Given $\xi$, for what functions
    $\phi = \phi(\xi, q),$
tending to infinity with $q$, is it true that
(11.2.2)  $\exists\, p \,.\, \left| \frac{p}{q} - \xi \right| \le \frac{1}{\phi}\,?$
Or for what functions
    $\phi = \phi(q),$
independent of $\xi$, is (11.2.2) true for every $\xi$? Here, naturally, the
larger $\phi$ the better. If we put the question in its second and stronger
form, it is substantially the same as the second form of question (1).
If $\phi$ is the function inverse to $\Phi$, it is substantially the same thing to
assert that (11.2.1) is true (with $\Phi$ independent of $\xi$) or that (11.2.2)
is true for all $\xi$ and $q$.

These questions, however, are not the questions most interesting to
us now. We are not so much interested in approximations to $\xi$ with
an arbitrary denominator $q$, as in approximations with an appropriately
selected $q$. For example, there is no great interest in approximations
to $\pi$ with denominator 11; what is interesting is that two particular
denominators, 7 and 113, give the very striking approximations $\frac{22}{7}$ and
$\frac{355}{113}$. We should ask, not how closely we can approximate to $\xi$ with an
arbitrary $q$, but how closely we can approximate for an infinity of
values of $q$.

We shall therefore be occupied, throughout the rest of this chapter,
with the following problem: for what $\phi = \phi(\xi, q)$, or $\phi = \phi(q)$, is it true,
for a given $\xi$, or for all $\xi$, or for all $\xi$ of some interesting class, that
(11.2.3)  $\left| \frac{p}{q} - \xi \right| < \frac{1}{\phi}$
for an infinity of $q$ and appropriate $p$? We know already, after Theorem
171, that we can take $\phi = q^2$ for all irrational $\xi$.


11.3. An argument of Dirichlet. In this section we prove Theorem
185 by a method independent of the theory of continued fractions.
The method gives nothing new, but is of great importance because it
can be extended to multi-dimensional problems.†

† See § 11.12.

We have already defined $[x]$, the greatest integer in $x$. We define
$(x)$ by
    $(x) = x - [x];$
and $\bar x$ as the difference between $x$ and the nearest integer, with the
convention that $\bar x = \tfrac12$ when $x$ is $n + \tfrac12$. Thus
    $\left[\tfrac53\right] = 1, \quad \left(\tfrac53\right) = \tfrac23, \quad \overline{\tfrac53} = -\tfrac13.$
Suppose $\xi$ and $\epsilon$ given. Then the $Q + 1$ numbers
    $0, \ (\xi), \ (2\xi), \ \ldots, \ (Q\xi)$
define $Q + 1$ points distributed among the $Q$ intervals or 'boxes'
    $\frac{s}{Q} \le x < \frac{s+1}{Q} \qquad (s = 0, 1, \ldots, Q-1).$
There must be one box which contains at least two points, and there-
fore two numbers $q_1$ and $q_2$, not greater than $Q$, such that $(q_1\xi)$ and
$(q_2\xi)$ differ by less than $1/Q$. If $q_2$ is the greater, and $q = q_2 - q_1$, then
$0 < q \le Q$ and $|\overline{q\xi}| < 1/Q$. There is therefore a $p$ such that
    $|q\xi - p| < \frac{1}{Q}.$
Hence, taking
    $Q = \left[ \frac{1}{\epsilon} \right] + 1,$
we obtain
    $\exists\, p, q \,.\, q \le \left[ \frac{1}{\epsilon} \right] + 1 \,.\, \left| \frac{p}{q} - \xi \right| < \frac{\epsilon}{q}$
(which is nearly the same as the result of Theorem 36) and
(11.3.1)  $\left| \frac{p}{q} - \xi \right| < \frac{1}{qQ} \le \frac{1}{q^2},$
which is (11.1.1).
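
Dirichlet's box argument is constructive, and can be carried out literally. The following Python sketch is an added illustration with hypothetical names: it places the numbers $(k\xi)$, $k = 0, 1, \ldots, Q$, into the $Q$ boxes and returns $p$, $q$ as soon as two of them share a box. Exact rational arithmetic is used, so $\xi$ is passed as a `Fraction`; the value chosen in the usage line is the rational value of a float close to $\sqrt 2 - 1$, which is an assumption of the example.

```python
from fractions import Fraction
from math import floor, sqrt

def dirichlet(xi, Q):
    """Return (p, q) with 0 < q <= Q and |q*xi - p| < 1/Q, by the box argument."""
    boxes = {}                                 # box index s -> multiplier k
    for k in range(Q + 1):                     # the Q + 1 numbers (k * xi)
        whole = floor(k * xi)
        s = floor((k * xi - whole) * Q)        # s/Q <= (k xi) < (s + 1)/Q
        if s in boxes:
            q1 = boxes[s]
            # (k xi) and (q1 xi) lie in the same box, so they differ by < 1/Q.
            return whole - floor(q1 * xi), k - q1
        boxes[s] = k
    raise AssertionError("unreachable: Q + 1 points cannot avoid sharing one of Q boxes")

xi = Fraction(sqrt(2)) - 1
p, q = dirichlet(xi, 100)
print(q <= 100 and abs(q * xi - p) < Fraction(1, 100))   # True
```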
If $\xi$ is rational, then there is only a finite number of solutions.† We
have to prove that there is an infinity when $\xi$ is irrational. Suppose that
    $\frac{p_1}{q_1}, \ \frac{p_2}{q_2}, \ \ldots, \ \frac{p_k}{q_k}$
exhaust the solutions. Since $\xi$ is irrational, there is a $Q$ such that
    $\left| \frac{p_s}{q_s} - \xi \right| > \frac{1}{Q} \qquad (s = 1, 2, \ldots, k).$
But then the $p/q$ of (11.3.1) satisfies
    $\left| \frac{p}{q} - \xi \right| < \frac{1}{qQ} \le \frac{1}{Q},$
and is not one of $p_s/q_s$; a contradiction. Hence the number of solutions
of (11.1.1) is infinite.


Dirichlet's argument proves that $q\xi$ is nearly an integer, so that $(q\xi)$ is nearly
0 or 1, but does not distinguish between these cases. The argument of § 11.1
gives rather more: for
    $\frac{p_n}{q_n} - \xi = \frac{(-1)^{n-1}}{q_n q'_{n+1}}$
is positive or negative according as $n$ is odd or even, and $q_n\xi$ is alternately a
little less and a little greater than $p_n$.


† The proof of this in § 11.1 was independent of continued fractions.

11.4. Orders of approximation. We shall say that $\xi$ is approxim-
able by rationals to order $n$ if there is a $K(\xi)$, depending only on $\xi$, for
which
(11.4.1)  $\left| \frac{p}{q} - \xi \right| < \frac{K(\xi)}{q^n}$
has an infinity of solutions.

We can dismiss the trivial case in which $\xi$ is rational. If we look back
at (11.1.2), and observe that the equation $bp - aq = 1$ has an infinity
of solutions, we obtain


THEOREM 186. A rational is approximable to order 1, and to no higher
order.


We may therefore suppose $\xi$ irrational. After Theorem 171, we have
THEOREM 187. Any irrational is approximable to order 2.


We can go farther when $\xi$ is a quadratic surd (i.e. the root of a
quadratic equation with integral coefficients). We shall sometimes
describe such a $\xi$ as a quadratic irrational, or simply as 'quadratic'.


THEOREM 188. A quadratic irrational is approximable to order 2 and
to no higher order.


The continued fraction for a quadratic $\xi$ is periodic, by Theorem 177.
In particular its quotients are bounded, so that
    $0 < a_n < M,$
where $M$ depends only on $\xi$. Hence, by (10.5.2),
    $q'_{n+1} = a'_{n+1} q_n + q_{n-1} < (a_{n+1} + 1) q_n + q_{n-1} < (M+2) q_n$
and a fortiori $q_{n+1} < (M+2) q_n$. Similarly $q_n < (M+2) q_{n-1}$.


Suppose now that $q_{n-1} < q \le q_n$.
Then $q_n < (M+2) q$ and, by Theorem 181,
    $\left| \frac{p}{q} - \xi \right| \ge \left| \frac{p_n}{q_n} - \xi \right| = \frac{1}{q_n q'_{n+1}} > \frac{1}{(M+2) q_n^2} > \frac{1}{(M+2)^3 q^2} = \frac{K}{q^2},$
where $K = (M+2)^{-3}$; and this proves the theorem.

The negative half of Theorem 188 is a special case of a theorem
(Theorem 191) which we shall prove in § 11.7 without the use of con-
tinued fractions. This requires some preliminary explanations and some
new definitions.

11.5. Algebraic and transcendental numbers. An algebraic
number is a number $x$ which satisfies an algebraic equation, i.e. an
equation
(11.5.1)  $a_0 x^n + a_1 x^{n-1} + \cdots + a_n = 0,$
where $a_0, a_1, \ldots$ are integers, not all zero.

A number which is not algebraic is called transcendental.

If $x = a/b$, then $bx - a = 0$, so that any rational $x$ is algebraic. Any
quadratic surd is algebraic; thus $i = \sqrt{-1}$ is algebraic. But in this
chapter we are concerned with real algebraic numbers.

An algebraic number satisfies any number of algebraic equations of
different degrees; thus $x = \sqrt 2$ satisfies $x^2 - 2 = 0$, $x^4 - 4 = 0$, ... . If $x$
satisfies an algebraic equation of degree $n$, but none of lower degree,
then we say that $x$ is of degree $n$. Thus a rational is of degree 1.

A number is Euclidean if it measures a length which can be con-
structed, starting from a given unit length, by a Euclidean construction,
i.e. a finite construction with ruler and compasses only. Thus $\sqrt 2$ is
Euclidean. It is plain that we can construct any finite combination of
real quadratic surds, such as
(11.5.2)  $\sqrt{11 + 2\sqrt 7} - \sqrt{11 - 2\sqrt 7}$
by Euclidean methods. We may describe such a number as of real
quadratic type.

Conversely, any Euclidean construction depends upon a series of
points defined as intersections of lines and circles. The coordinates
of each point in turn are defined by two equations of the types
    $lx + my + n = 0$
or
    $x^2 + y^2 + 2gx + 2fy + c = 0,$
where $l, m, n, g, f, c$ are measures of lengths already constructed; and
two such equations define $x$ and $y$ as real quadratic combinations of
$l, m, \ldots$. Hence every Euclidean number is of real quadratic type.

The number (11.5.2) is defined by
    $x = y - z, \quad y^2 = 11 + 2t, \quad z^2 = 11 - 2t, \quad t^2 = 7,$
and we obtain
    $x^4 - 44x^2 + 112 = 0$
on eliminating $y$, $z$, and $t$. Thus $x$ is algebraic. It is not difficult to
prove that any Euclidean number is algebraic, but the proof demands
a little knowledge of the general theory of algebraic numbers.†
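
The elimination is short enough to verify numerically. Squaring $x = y - z$ gives $x^2 = 22 - 2yz$ with $y^2 z^2 = 121 - 4t^2 = 93$, and squaring again gives $(x^2 - 22)^2 = 372$, which is the quartic above. The Python lines below are an added floating-point check only, so the residual is expected to be tiny rather than exactly zero.

```python
from math import sqrt

t = sqrt(7)
x = sqrt(11 + 2 * t) - sqrt(11 - 2 * t)     # the number (11.5.2)
residual = x**4 - 44 * x**2 + 112           # should vanish up to rounding error
print(abs(residual) < 1e-9)                 # True
```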

† In fact any number defined by an equation $\alpha_0 x^n + \alpha_1 x^{n-1} + \cdots + \alpha_n = 0$, where
$\alpha_0, \alpha_1, \ldots, \alpha_n$ are algebraic, is algebraic. For the proof see Hecke 66, or Hardy, Pure
mathematics (ed. 9, 1944), 39.

11.6. The existence of transcendental numbers. It is not imme-
diately obvious that there are any transcendental numbers, though
actually, as we shall see in a moment, almost all real numbers are
transcendental.

We may distinguish three different problems. The first is that of
proving the existence of transcendental numbers (without necessarily
producing a specimen). The second is that of giving an example of
a transcendental number by a construction specially designed for the
purpose. The third, which is much more difficult, is that of proving
that some number given independently, some one of the 'natural'
numbers of analysis, such as $e$ or $\pi$, is transcendental.

We may define the rank of the equation (11.5.1) as
    $N = n + |a_0| + |a_1| + \cdots + |a_n|.$
The minimum value of $N$ is 2. It is plain that there are only a finite
number of equations
    $E_{N,1}, \ E_{N,2}, \ \ldots, \ E_{N,k_N}$
of rank $N$. We can arrange the equations in the sequence
    $E_{2,1}, E_{2,2}, \ldots, E_{2,k_2}, \ E_{3,1}, E_{3,2}, \ldots, E_{3,k_3}, \ E_{4,1}, \ldots$
and so correlate them with the numbers 1, 2, 3, ... . Hence the aggregate
of equations is enumerable. But every algebraic number corresponds
to at least one of these equations, and the number of algebraic numbers
corresponding to any equation is finite. Hence


THEOREM 189. The aggregate of algebraic numbers is enumerable.


In particular, the aggregate of real algebraic numbers has measure
zero.


THEOREM 190. Almost all real numbers are transcendental.
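
The enumeration by rank can be made concrete. The sketch below is an added illustration; the function name is hypothetical, and it makes one assumption that the text leaves implicit, namely that an equation of degree $n$ has $n \ge 1$ and leading coefficient $a_0 \neq 0$. With that convention the two equations of rank 2 are $x = 0$ and $-x = 0$.

```python
from itertools import product

def equations_of_rank(N):
    """Coefficient tuples (a_0, ..., a_n) with n >= 1, a_0 != 0 and
    n + |a_0| + ... + |a_n| == N. Assumes the leading coefficient is nonzero."""
    equations = []
    for n in range(1, N):                 # degree of the equation
        total = N - n                     # required value of |a_0| + ... + |a_n|
        for coeffs in product(range(-total, total + 1), repeat=n + 1):
            if coeffs[0] != 0 and sum(abs(a) for a in coeffs) == total:
                equations.append(coeffs)
    return equations

# Listing rank 2, then rank 3, then rank 4, ... enumerates every equation.
for N in (2, 3, 4):
    print(N, len(equations_of_rank(N)))
```

Since each rank contributes finitely many tuples, concatenating the lists for $N = 2, 3, 4, \ldots$ gives the sequence $E_{2,1}, E_{2,2}, \ldots$ of the text.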


Cantor, who had not the more modern concept of measure, arranged his proof
of the existence of transcendental numbers differently. After Theorem 189, it is
enough to prove that the continuum $0 < x < 1$ is not enumerable. We represent
$x$ by its decimal
    $x = \cdot a_1 a_2 a_3 \ldots$
($\dot 9$ being excluded, as in § 9.1). Suppose that the continuum is enumerable, as
$x_1, x_2, x_3, \ldots$, and let
    $x_1 = \cdot a_{11} a_{12} a_{13} \ldots$
    $x_2 = \cdot a_{21} a_{22} a_{23} \ldots$
    $x_3 = \cdot a_{31} a_{32} a_{33} \ldots$
    . . . . . . . . . . . .
If now we define $a_n$ by
    $a_n = a_{nn} + 1$ (if $a_{nn}$ is neither 8 nor 9),
    $a_n = 0$ (if $a_{nn}$ is 8 or 9),
then $a_n \neq a_{nn}$ for any $n$; and $x$ cannot be any of $x_1, x_2, \ldots$, since its decimal
differs from that of any $x_n$ in the $n$th digit. This is a contradiction.
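
The diagonal rule is simple enough to state as a function. The sketch below is an added illustration (the name `diagonal_digits` is hypothetical); it takes the first few rows of digits $a_{n1}, a_{n2}, \ldots$ and returns the digits $a_1, a_2, \ldots$ defined by the rule above. The rule never produces a 9, so the new decimal cannot end in recurring 9's.

```python
def diagonal_digits(rows):
    """Digits a_1, a_2, ... obtained from the diagonal a_{nn} by Cantor's rule:
    a_n = a_{nn} + 1 unless a_{nn} is 8 or 9, in which case a_n = 0."""
    digits = []
    for n, row in enumerate(rows):
        d = row[n]                       # the diagonal digit a_{nn}
        digits.append(0 if d in (8, 9) else d + 1)
    return digits

rows = [[1, 2, 3],
        [4, 8, 6],
        [7, 8, 9]]
print(diagonal_digits(rows))   # [2, 0, 0]
```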
11.7. Liouville's theorem and the construction of transcen-
dental numbers. Liouville proved a theorem which enables us to
produce as many examples of transcendental numbers as we please. It
is the generalization to algebraic numbers of any degree of the negative
half of Theorem 188.

THEOREM 191. A real algebraic number of degree $n$ is not approximable
to any order greater than $n$.


An algebraic number $\xi$ satisfies an equation
    $f(\xi) = a_0 \xi^n + a_1 \xi^{n-1} + \cdots + a_n = 0$
with integral coefficients. There is a number $M(\xi)$ such that
(11.7.1)  $|f'(x)| < M \qquad (\xi - 1 < x < \xi + 1).$
Suppose now that $p/q \neq \xi$ is an approximation to $\xi$. We may assume
the approximation close enough to ensure that $p/q$ lies in $(\xi - 1, \xi + 1)$,
and is nearer to $\xi$ than any other root of $f(x) = 0$, so that $f(p/q) \neq 0$.
Then
(11.7.2)  $\left| f\!\left(\frac{p}{q}\right) \right| = \frac{|a_0 p^n + a_1 p^{n-1} q + \cdots|}{q^n} \ge \frac{1}{q^n},$
since the numerator is a positive integer; and
(11.7.3)  $f\!\left(\frac{p}{q}\right) = f\!\left(\frac{p}{q}\right) - f(\xi) = \left(\frac{p}{q} - \xi\right) f'(x),$
where $x$ lies between $p/q$ and $\xi$. It follows from (11.7.2) and (11.7.3) that
    $\left| \frac{p}{q} - \xi \right| = \frac{|f(p/q)|}{|f'(x)|} > \frac{1}{Mq^n} = \frac{K}{q^n},$
so that $\xi$ is not approximable to any order higher than $n$.

The cases $n = 1$ and $n = 2$ are covered by Theorems 186 and 188.
These theorems, of course, included a positive as well as a negative
statement.


(a) Suppose, for example, that
    $\xi = \cdot 110001000\ldots = 10^{-1!} + 10^{-2!} + 10^{-3!} + \cdots,$
that $n > N$, and that $\xi_n$ is the sum of the first $n$ terms of the series.
Then
    $\xi_n = \frac{p}{10^{n!}} = \frac{p}{q},$
say. Also
    $0 < \xi - \frac{p}{q} = \xi - \xi_n = 10^{-(n+1)!} + 10^{-(n+2)!} + \cdots < 2 \cdot 10^{-(n+1)!} < 2q^{-N}.$
Hence $\xi$ is not an algebraic number of degree less than $N$. Since $N$ is
arbitrary, $\xi$ is transcendental.
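
The size of the tail in example (a) can be checked exactly with rational arithmetic. In the sketch below, an added illustration with hypothetical names, $\xi$ itself is not available, so a later partial sum $\xi_{n+2}$ stands in for it; this is an assumption of the check, and it only tests the first two terms of the tail against the bound $2 \cdot 10^{-(n+1)!}$.

```python
from fractions import Fraction
from math import factorial

def liouville_partial_sum(n):
    """xi_n = 10^(-1!) + 10^(-2!) + ... + 10^(-n!) as an exact fraction."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, n + 1))

def tail_is_small(n, extra=2):
    """Check 0 < xi_{n+extra} - xi_n < 2 * 10^(-(n+1)!)."""
    gap = liouville_partial_sum(n + extra) - liouville_partial_sum(n)
    return 0 < gap < Fraction(2, 10 ** factorial(n + 1))

print(liouville_partial_sum(3))                   # 110001/1000000
print(all(tail_is_small(n) for n in range(1, 5)))  # True
```

The denominator of $\xi_n$ in lowest terms is $q = 10^{n!}$, so the error is smaller than $2q^{-(n+1)}$, far better than any fixed order of approximation allows for an algebraic number.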

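(b) Suppose that
    $\xi = \frac{1}{10+}\,\frac{1}{10^{2!}+}\,\frac{1}{10^{3!}+}\cdots,$
that $n > N$, and that
    $\frac{p}{q} = \frac{p_n}{q_n},$
the $n$th convergent to $\xi$. Then
    $\left| \frac{p}{q} - \xi \right| = \frac{1}{q_n q'_{n+1}} < \frac{1}{a_{n+1} q_n^2} < \frac{1}{a_{n+1}}.$
Now $a_{n+1} = 10^{(n+1)!}$ and
    $q_1 < a_1 + 1, \quad \frac{q_{n+1}}{q_n} = a_{n+1} + \frac{q_{n-1}}{q_n} < a_{n+1} + 1 \quad (n \ge 1);$
so that
    $q_n < (a_1 + 1)(a_2 + 1)\cdots(a_n + 1)$
    $\phantom{q_n} < \left(1 + \frac{1}{10}\right)\left(1 + \frac{1}{10^2}\right)\left(1 + \frac{1}{10^3}\right)\cdots a_1 a_2 \cdots a_n$
    $\phantom{q_n} < 2 a_1 a_2 \cdots a_n = 2 \cdot 10^{1! + \cdots + n!} < 10^{2(n!)} = a_n^2,$
    $\left| \frac{p}{q} - \xi \right| < \frac{1}{a_{n+1}} = \frac{1}{a_n^{n+1}} < \frac{1}{a_n^N} < \frac{1}{q^{\frac12 N}}.$
We conclude, as before, that $\xi$ is transcendental.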


THEOREM 192. The numbers
    $\xi = 10^{-1!} + 10^{-2!} + 10^{-3!} + \cdots$
and
    $\xi = \frac{1}{10^{1!}+}\,\frac{1}{10^{2!}+}\,\frac{1}{10^{3!}+}\cdots$
are transcendental.

It is plain that we could replace 10 by other integers, and vary the
construction in many other ways. The general principle of the construc-
tion is simply that a number defined by a sufficiently rapid sequence of
rational approximations is necessarily transcendental. It is the simplest
irrationals, such as $\sqrt 2$ or $\tfrac12(\sqrt 5 - 1)$, which are the least rapidly
approximable.

It is much more difficult to prove that a number given 'naturally' is
transcendental. We shall prove $e$ and $\pi$ transcendental in §§ 11.13-14.
Few classes of transcendental numbers are known even now. These
classes include, for example, the numbers
    $e, \ \pi, \ \sin 1, \ J_0(1), \ \log 2, \ \frac{\log 3}{\log 2}, \ e^{\pi}, \ 2^{\sqrt 2}$
but not $2^e$, $2^{\pi}$, $\pi^e$, or Euler's constant $\gamma$. It has never been proved even
that any of these last numbers are irrational.


11.8. The measure of the closest approximations to an arbitrary
irrational. We know that every irrational has an infinity of approxi-
mations satisfying (11.1.1), and indeed, after Theorem 183 of Ch. X,
of rather better approximations. We know also that an algebraic
number, which is an irrational of a comparatively simple type, cannot
be 'too rapidly' approximable, while the transcendental numbers of
Theorem 192 have approximations of abnormal rapidity.

The best approximations to $\xi$ are given, after Theorem 181, by the
convergents $p_n/q_n$ of the continued fraction for $\xi$; and
    $\left| \frac{p_n}{q_n} - \xi \right| = \frac{1}{q_n q'_{n+1}} < \frac{1}{a_{n+1} q_n^2},$
so that we get a particularly good approximation when $a_{n+1}$ is large. It
is plain that, to put the matter roughly, $\xi$ will or will not be rapidly
approximable according as its continued fraction does or does not
contain a sequence of rapidly increasing quotients. The second $\xi$ of
Theorem 192, whose quotients increase with great rapidity, is a particu-
larly instructive example.

One may say, again very roughly, that the structure of the continued
fraction for $\xi$ affords a measure of the 'simplicity' or 'complexity' of $\xi$.
Thus the second $\xi$ of Theorem 192 is a 'complicated' number. On the
other hand, if $a_n$ behaves regularly, and does not become too large, then
$\xi$ may reasonably be regarded as a 'simple' number; and in this case
the rational approximations to $\xi$ cannot be too good. From the point
of view of rational approximation, the simplest numbers are the worst.
The 'simplest' of all irrationals, from this point of view, is the number
(11.8.1)  $\xi = \tfrac12(\sqrt 5 - 1) = \frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{1+}\cdots,$
in which every $a_n$ has the smallest possible value. The convergents to
this fraction are
    $\frac01, \ \frac11, \ \frac12, \ \frac23, \ \frac35, \ \frac58, \ \ldots,$
so that $q_{n-1} = p_n$ and
    $\frac{q_{n-1}}{q_n} = \frac{p_n}{q_n} \to \xi.$
Hence
    $\left| \frac{p_n}{q_n} - \xi \right| = \frac{1}{q_n q'_{n+1}} = \frac{1}{q_n\{(1+\xi) q_n + q_{n-1}\}}$
    $= \frac{1}{q_n^2\left(1 + \xi + \frac{q_{n-1}}{q_n}\right)} \sim \frac{1}{q_n^2(1 + 2\xi)} = \frac{1}{q_n^2 \sqrt 5}$
when $n \to \infty$.
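
The limit just obtained is visible after a handful of convergents. The Python sketch below is an added illustration with hypothetical names; it uses the `decimal` module at 50 significant digits (an arbitrary choice) to evaluate $q_n^2\,|p_n/q_n - \xi|$ for the convergents $0/1, 1/1, 1/2, 2/3, \ldots$ of (11.8.1) and compares the result with $1/\sqrt 5$.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50
SQRT5 = Decimal(5).sqrt()
XI = (SQRT5 - 1) / 2                 # the number (11.8.1)

def scaled_errors(count):
    """q_n^2 * |p_n/q_n - xi| for the convergents 0/1, 1/1, 1/2, 2/3, ..."""
    p, q = 0, 1
    values = []
    for _ in range(count):
        values.append(q * abs(q * XI - p))
        p, q = q, p + q              # p_{n+1} = q_n, q_{n+1} = p_n + q_n
    return values

errors = scaled_errors(25)
print(errors[-1])        # close to 1/sqrt(5) = 0.447213595...
print(1 / SQRT5)
```

The values oscillate about $1/\sqrt 5$, alternately above and below it, which is why no constant larger than $\sqrt 5$ can work for this $\xi$.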
These considerations suggest the truth of the following theorem.

THEOREM 193. Any irrational $\xi$ has an infinity of approximations which
satisfy
(11.8.2)  $\left| \frac{p}{q} - \xi \right| < \frac{1}{q^2 \sqrt 5}.$

The proof of this theorem requires some further analysis of the
approximations given by the convergents to the continued fraction.
This we give in the next section, but we prove first a complement to
the theorem which shows that it is in a certain sense a 'best possible'
theorem.


THEOREM 194. In Theorem 193, the number $\sqrt 5$ is the best possible
number: the theorem would become false if any larger number were substi-
tuted for $\sqrt 5$.


It is enough to show that, if $A > \sqrt 5$, and $\xi$ is the particular number
(11.8.1), then the inequality
    $\left| \frac{p}{q} - \xi \right| < \frac{1}{Aq^2}$
has only a finite number of solutions.


Suppose the contrary. Then there are infinitely many $q$ and $p$ such
that
    $\xi = \frac{p}{q} + \frac{\delta}{q^2}, \quad |\delta| < \frac{1}{A} < \frac{1}{\sqrt 5}.$
Hence
    $\frac{\delta}{q} = q\xi - p, \quad \frac{\delta}{q} - \tfrac12 q\sqrt 5 = -\tfrac12 q - p,$
    $\frac{\delta^2}{q^2} - \delta\sqrt 5 = (\tfrac12 q + p)^2 - \tfrac54 q^2 = p^2 + pq - q^2.$
The left-hand side is numerically less than 1 when $q$ is large, while the
right-hand side is integral. Hence $p^2 + pq - q^2 = 0$ or $(2p + q)^2 = 5q^2$,
which is plainly impossible.


11.9. Another theorem concerning the convergents to a con-
tinued fraction. Our main object in this section is to prove


THEOREM 195. Of any three consecutive convergents to $\xi$, one at least
satisfies (11.8.2).


This theorem should be compared with Theorem 183 of Ch. X.
We write
(11.9.1)  $\frac{q_{n-1}}{q_n} = b_{n+1}.$
Then
    $\left| \frac{p_n}{q_n} - \xi \right| = \frac{1}{q_n q'_{n+1}} = \frac{1}{q_n^2 (a'_{n+1} + b_{n+1})},$
and it is enough to prove that
(11.9.2)  $a'_i + b_i \le \sqrt 5$
cannot be true for the three values $n-1$, $n$, $n+1$ of $i$.
Suppose that (11.9.2) is true for $i = n-1$ and $i = n$. We have
    $a'_{n-1} = a_{n-1} + \frac{1}{a'_n}$
and
(11.9.3)  $\frac{1}{b_n} = \frac{q_{n-1}}{q_{n-2}} = a_{n-1} + b_{n-1}.$
Hence
    $\frac{1}{a'_n} + \frac{1}{b_n} = a'_{n-1} + b_{n-1} \le \sqrt 5,$
and
    $1 = a'_n \cdot \frac{1}{a'_n} \le (\sqrt 5 - b_n)\left(\sqrt 5 - \frac{1}{b_n}\right),$
or
    $b_n + \frac{1}{b_n} \le \sqrt 5.$
Equality is excluded, since $b_n$ is rational, and $b_n < 1$. Hence
    $b_n^2 - b_n \sqrt 5 + 1 < 0, \quad (\tfrac12 \sqrt 5 - b_n)^2 < \tfrac14,$
(11.9.4)  $b_n > \tfrac12(\sqrt 5 - 1).$
If (11.9.2) were true also for $i = n+1$, we could prove similarly that
(11.9.5)  $b_{n+1} > \tfrac12(\sqrt 5 - 1);$
and (11.9.3),† (11.9.4), and (11.9.5) would give
    $a_n = \frac{1}{b_{n+1}} - b_n < \tfrac12(\sqrt 5 + 1) - \tfrac12(\sqrt 5 - 1) = 1,$
a contradiction. This proves Theorem 195, and Theorem 193 is a
corollary.
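
Theorem 195 can be observed on a concrete number. In the following sketch, an added illustration with hypothetical names, $\xi$ is the exact rational value of the float `math.e` (an assumption that keeps the arithmetic exact and leaves the first dozen convergents of $e$ unchanged), and the inequality (11.8.2) is tested without square roots in the equivalent form $5q^4(p/q - \xi)^2 < 1$.

```python
from fractions import Fraction
import math

def convergents(x, count):
    """First `count` convergents (p_n, q_n) of the rational number x."""
    result = []
    p_prev, q_prev = 1, 0      # p_{-1}, q_{-1}
    p_prev2, q_prev2 = 0, 1    # p_{-2}, q_{-2}
    for _ in range(count):
        a = math.floor(x)
        p, q = a * p_prev + p_prev2, a * q_prev + q_prev2
        result.append((p, q))
        p_prev2, q_prev2, p_prev, q_prev = p_prev, q_prev, p, q
        frac = x - a
        if frac == 0:
            break
        x = 1 / frac
    return result

def satisfies_11_8_2(x, p, q):
    """|p/q - x| < 1/(sqrt(5) q^2), tested exactly as 5 q^4 (p/q - x)^2 < 1."""
    return 5 * q**4 * (Fraction(p, q) - x) ** 2 < 1

def check_theorem_195(x, count):
    """True if every three consecutive convergents contain one satisfying (11.8.2)."""
    flags = [satisfies_11_8_2(x, p, q) for p, q in convergents(x, count)]
    return all(any(flags[i:i + 3]) for i in range(len(flags) - 2))

xi = Fraction(math.e)
print(convergents(xi, 5))          # [(2, 1), (3, 1), (8, 3), (11, 4), (19, 7)]
print(check_theorem_195(xi, 12))   # True
```

For this $\xi$ the convergents $8/3$ and $11/4$ both fail (11.8.2) while $19/7$ satisfies it, so two consecutive failures do occur, and "three" in the theorem cannot be replaced by "two".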


† With $n+1$ for $n$.


11.10. Continued fractions with bounded quotients. The number
$\sqrt 5$ has a special status, in Theorems 193 and 195, which depends upon
the particular properties of the number (11.8.1). For this $\xi$, every $a_n$ is
1; for a $\xi$ equivalent to this one, in the sense of § 10.11, every $a_n$ from
a certain point is 1; but, for any other $\xi$, $a_n$ is at least 2 for infinitely
many $n$. It is natural to suppose that, if we excluded $\xi$ equivalent to
(11.8.1), the $\sqrt 5$ of Theorem 193 could be replaced by some larger
number; and this is actually true. Any irrational $\xi$ not equivalent to
(11.8.1) has an infinity of rational approximations for which
    $\left| \frac{p}{q} - \xi \right| < \frac{1}{2q^2 \sqrt 2}.$
There are other numbers besides $\sqrt 5$ and $2\sqrt 2$ which play a special part
in problems of this character, but we cannot discuss these problems
further here.

If $a_n$ is not bounded, i.e. if

(11.10.1) $$\limsup_{n\to\infty} a_n = \infty,$$

then $q_{n+1}/q_n$ assumes arbitrarily large values, and

(11.10.2) $$\left|\frac{p}{q}-\xi\right| < \frac{\epsilon}{q^2}$$

for every positive $\epsilon$ and an infinity of $p$ and $q$. Our next theorem shows
that this is the general case, since (11.10.1) is true for ‘almost all’ $\xi$ in
the sense of § 9.10.

THEOREM 196. $a_n$ is unbounded for almost all $\xi$; the set of $\xi$ for which
$a_n$ is bounded is null.

We may confine our attention to $\xi$ of (0, 1), so that $a_0 = 0$, and to
irrational $\xi$, since the set of rationals is null. It is enough to show that
the set $F_k$ of irrational $\xi$ for which

(11.10.3) $$a_n \le k$$

is null; for the set for which $a_n$ is bounded is the sum of $F_1$, $F_2$, $F_3$,....

We denote by
$$E_{a_1, a_2, \ldots, a_n}$$
the set of irrational $\xi$ for which the first $n$ quotients have given values
$a_1$, $a_2$,..., $a_n$. The set $E_{a_1}$ lies in the interval
$$\frac{1}{a_1+1}, \qquad \frac{1}{a_1},$$
which we call $I_{a_1}$. The set $E_{a_1, a_2}$ lies in
$$\frac{1}{a_1+\dfrac{1}{a_2}}, \qquad \frac{1}{a_1+\dfrac{1}{a_2+1}},$$
which we call $I_{a_1, a_2}$. Generally, $E_{a_1, a_2, \ldots, a_n}$ lies in the interval
$I_{a_1, a_2, \ldots, a_n}$ whose end points are
$$[a_1, a_2, \ldots, a_{n-1}, a_n+1], \qquad [a_1, a_2, \ldots, a_{n-1}, a_n]$$
(the first being the left-hand end point when $n$ is odd). The intervals
corresponding to different sets $a_1$, $a_2$,..., $a_n$ are mutually exclusive
(except that they may have end points in common), the choice of $a_{\nu+1}$
dividing up $I_{a_1, a_2, \ldots, a_\nu}$ into exclusive intervals. Thus $I_{a_1, a_2, \ldots, a_n}$ is the
sum of
$$I_{a_1, a_2, \ldots, a_n, 1}, \quad I_{a_1, a_2, \ldots, a_n, 2}, \quad \ldots.$$

The end points of $I_{a_1, a_2, \ldots, a_n}$ can also be expressed as
$$\frac{(a_n+1)p_{n-1}+p_{n-2}}{(a_n+1)q_{n-1}+q_{n-2}}, \qquad \frac{a_n p_{n-1}+p_{n-2}}{a_n q_{n-1}+q_{n-2}};$$
and its length (for which we use the same symbol as for the interval) is
$$\frac{1}{\{(a_n+1)q_{n-1}+q_{n-2}\}(a_n q_{n-1}+q_{n-2})} = \frac{1}{(q_n+q_{n-1})q_n}.$$
Thus
$$I_{a_1} = \frac{1}{(a_1+1)a_1}.$$
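We denote by
$$E_{a_1, a_2, \ldots, a_n;\, k}$$
the sub-set of $E_{a_1, a_2, \ldots, a_n}$ for which $a_{n+1} \le k$. The set is the sum of
$$E_{a_1, a_2, \ldots, a_n, a_{n+1}} \qquad (a_{n+1} = 1, 2, \ldots, k).$$
The last set lies in the interval $I_{a_1, a_2, \ldots, a_n, a_{n+1}}$, whose end points are
$$[a_1, a_2, \ldots, a_n, a_{n+1}+1], \qquad [a_1, a_2, \ldots, a_n, a_{n+1}];$$
and so $E_{a_1, a_2, \ldots, a_n;\, k}$ lies in the interval $I_{a_1, a_2, \ldots, a_n;\, k}$, whose end points are
$$[a_1, a_2, \ldots, a_n, k+1], \qquad [a_1, a_2, \ldots, a_n, 1],$$
or
$$\frac{(k+1)p_n+p_{n-1}}{(k+1)q_n+q_{n-1}}, \qquad \frac{p_n+p_{n-1}}{q_n+q_{n-1}}.$$
The length of $I_{a_1, a_2, \ldots, a_n;\, k}$ is
$$\frac{k}{\{(k+1)q_n+q_{n-1}\}(q_n+q_{n-1})},$$
and

(11.10.4) $$\frac{I_{a_1, a_2, \ldots, a_n;\, k}}{I_{a_1, a_2, \ldots, a_n}} = \frac{kq_n}{(k+1)q_n+q_{n-1}} < \frac{k}{k+1}$$

for all $a_1$, $a_2$,..., $a_n$.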
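Finally, we denote by
$$I_k^{(n)} = \sum_{a_1 \le k, \ldots, a_n \le k} I_{a_1, a_2, \ldots, a_n}$$
the sum of the $I_{a_1, a_2, \ldots, a_n}$ for which $a_1 \le k$,..., $a_n \le k$; and by $F_k^{(n)}$ the set of
irrational $\xi$ for which $a_1 \le k$,..., $a_n \le k$. Plainly $F_k^{(n)}$ is included in $I_k^{(n)}$.
First, $I_k^{(1)}$ is the sum of $I_{a_1}$ for $a_1 = 1, 2, \ldots, k$, and
$$I_k^{(1)} = \sum_{a_1=1}^{k} \frac{1}{a_1(a_1+1)} = \sum_{a_1=1}^{k}\left(\frac{1}{a_1}-\frac{1}{a_1+1}\right) = 1-\frac{1}{k+1} = \frac{k}{k+1}.$$
Generally, $I_k^{(n+1)}$ is the sum of the parts of the $I_{a_1, a_2, \ldots, a_n}$, included in
$I_k^{(n)}$, for which $a_{n+1} \le k$, i.e. is
$$\sum_{a_1 \le k, \ldots, a_n \le k} I_{a_1, a_2, \ldots, a_n;\, k}.$$

Hence, by (11.10.4),
$$I_k^{(n+1)} < \frac{k}{k+1}\sum_{a_1 \le k, \ldots, a_n \le k} I_{a_1, a_2, \ldots, a_n} = \frac{k}{k+1}\, I_k^{(n)};$$
and so
$$I_k^{(n+1)} < \left(\frac{k}{k+1}\right)^{n+1}.$$
It follows that $F_k^{(n)}$ can be included in a set of intervals of length less
than
$$\left(\frac{k}{k+1}\right)^{n},$$
which tends to zero when $n \to \infty$. Since $F_k$ is part of $F_k^{(n)}$ for every $n$,
the theorem follows.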
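
The estimate can be verified exactly for small $k$ and $n$. The following Python sketch (the function names are ours, and the brute-force enumeration is only an illustration, not part of the proof) sums the lengths $1/\{q_n(q_n+q_{n-1})\}$ of all the intervals $I_{a_1, \ldots, a_n}$ with every $a_i \le k$, using rational arithmetic, so that the total can be compared with $\{k/(k+1)\}^n$.

```python
from fractions import Fraction
from itertools import product


def interval_length(quotients):
    """Length 1/(q_n (q_n + q_{n-1})) of the interval I_{a_1,...,a_n}."""
    q_prev, q = 0, 1  # q_{-1} = 0 and q_0 = 1, since a_0 = 0
    for a in quotients:
        q_prev, q = q, a * q + q_prev
    return Fraction(1, q * (q + q_prev))


def total_length(k, n):
    """Exact value of I_k^(n): the sum over all a_1, ..., a_n between 1 and k."""
    return sum(
        (interval_length(a) for a in product(range(1, k + 1), repeat=n)),
        Fraction(0),
    )


if __name__ == "__main__":
    k = 3
    for n in range(1, 7):
        bound = Fraction(k, k + 1) ** n
        print(n, float(total_length(k, n)), float(bound))
```

The enumeration has $k^n$ terms, so it is practical only for small values; it is meant to make the geometric decay visible, not to replace the argument.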

It is possible to prove a good deal more by the same kind of argument. 
Thus Borel and F. Bernstein proved 


THEOREM 197*. If $\phi(n)$ is an increasing function of $n$ for which

(11.10.5) $$\sum \frac{1}{\phi(n)}$$

is divergent, then the set of $\xi$ for which

(11.10.6) $$a_n \le \phi(n),$$

for all sufficiently large $n$, is null. On the other hand, if

(11.10.7) $$\sum \frac{1}{\phi(n)}$$

is convergent, then (11.10.6) is true for almost all $\xi$ and sufficiently large $n$.

Theorem 196 is the special case of this theorem in which $\phi(n)$ is a
constant. The proof of the general theorem is naturally a little more
complex, but does not involve any essentially new idea.
11.11. Further theorems concerning approximation. Let us suppose, to
fix our ideas, that $a_n$ tends steadily, fairly regularly, and not too rapidly, to
infinity. Then
$$\left|\frac{p_n}{q_n}-\xi\right| = \frac{1}{q_n q'_{n+1}} \sim \frac{1}{q_n q_{n+1}} \sim \frac{1}{a_{n+1} q_n^2} = \frac{1}{q_n \chi(q_n)},$$
where $\chi(q_n) = a_{n+1} q_n$.

There is a certain correspondence between the behaviour, in respect of
convergence or divergence, of the series†
$$\sum_{\nu} \frac{1}{\chi(\nu)}, \qquad \sum_{n} \frac{q_n}{\chi(q_n)};$$
and the latter series is
$$\sum \frac{1}{a_{n+1}}.$$

† The idea is that underlying ‘Cauchy’s condensation test’ for the convergence or
divergence of a series of decreasing positive terms. See Hardy, Pure mathematics, 9th
ed., 354.

These rough considerations suggest that, if we compare the inequalities

(11.11.1) $$a_n < \phi(n)$$

and

(11.11.2) $$\left|\frac{p}{q}-\xi\right| < \frac{1}{q\chi(q)},$$

there should be a certain correspondence between conditions on the two series
$$\sum \frac{1}{\phi(n)}, \qquad \sum \frac{1}{\chi(q)}.$$

And the theorems of § 11.10 then suggest the two which follow.


THEOREM 198. If
$$\sum \frac{1}{\chi(q)}$$
is convergent, then the set of $\xi$ which satisfy (11.11.2) for an infinity of $q$ is null.


THEOREM 199*. If $\chi(q)/q$ increases with $q$, and
$$\sum \frac{1}{\chi(q)}$$
is divergent, then (11.11.2) is true, for an infinity of $q$, for almost all $\xi$.


Theorem 199 is difficult. But Theorem 198 is very easy, and can be proved
without continued fractions. It shows, roughly, that most irrationals cannot be
approximated by rationals with an error of order much less than $q^{-2}$, e.g. with
an error
$$O\left(\frac{1}{q^2(\log q)^2}\right).$$

The more difficult theorem shows that approximation to such orders as
$$O\left(\frac{1}{q^2\log q}\right), \qquad O\left(\frac{1}{q^2\log q\,\log\log q}\right), \quad \ldots$$
is usually possible.

We may suppose $0 < \xi < 1$. We enclose every $p/q$ for which $q \ge N$ in an
interval
$$\frac{p}{q}-\frac{1}{q\chi(q)}, \qquad \frac{p}{q}+\frac{1}{q\chi(q)}.$$
There are less than $q$ values of $p$ corresponding to a given $q$, and the total length
of the intervals is less (even without allowance for overlapping) than
$$2\sum_{q=N}^{\infty} \frac{1}{\chi(q)},$$
which tends to 0 when $N \to \infty$. Any $\xi$ which has the property is included in an
interval, whatever be $N$, and the set of $\xi$ can therefore be included in a set of
intervals whose total length is as small as we please.


11.12. Simultaneous approximation. So far we have been concerned
with approximations to a single irrational $\xi$. Dirichlet’s argument
of § 11.3 has an important application to a multi-dimensional problem,
that of the simultaneous approximation of $k$ numbers
$$\xi_1, \xi_2, \ldots, \xi_k$$
by fractions
$$\frac{p_1}{q}, \frac{p_2}{q}, \ldots, \frac{p_k}{q}$$
with the same denominator $q$ (but not necessarily irreducible).


THEOREM 200. If $\xi_1$, $\xi_2$,..., $\xi_k$ are any real numbers, then the system of
inequalities

(11.12.1) $$\left|\frac{p_i}{q}-\xi_i\right| < \frac{1}{q^{1+\mu}} \qquad \left(\mu = \frac{1}{k};\ i = 1, 2, \ldots, k\right)$$

has at least one solution. If one $\xi$ at least is irrational, then it has an
infinity of solutions.


We may plainly suppose that $0 \le \xi_i < 1$ for every $i$. We consider
the $k$-dimensional ‘cube’ defined by $0 \le x_i < 1$, and divide it into $Q^k$
‘boxes’ by drawing ‘planes’ parallel to its faces at distances $1/Q$. Of
the $Q^k+1$ points
$$(l\xi_1), (l\xi_2), \ldots, (l\xi_k) \qquad (l = 0, 1, 2, \ldots, Q^k),$$
some two, corresponding say to $l = q_1$ and $l = q_2 > q_1$, must lie in the
same box. Hence, taking $q = q_2-q_1$, as in § 11.3, there is a $q \le Q^k$
such that
$$\overline{q\xi_i} < \frac{1}{Q} \le \frac{1}{q^{\mu}}$$
for every $i$.

The proof may be completed as before; if a $\xi$, say $\xi_i$, is irrational, then
$\xi_i$ may be substituted for $\xi$ in the final argument of § 11.3.

In particular we have


THEOREM 201. Given $\xi_1$, $\xi_2$,..., $\xi_k$ and any positive $\epsilon$, we can find an
integer $q$ so that $q\xi_i$ differs from an integer, for every $i$, by less than $\epsilon$.
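
The box argument guarantees a suitable $q$ among $1, 2, \ldots, Q^k$, so it can be found by direct search. The Python sketch below is a rough illustration under our own assumptions: the names are hypothetical, the sample numbers $\sqrt2$ and $\sqrt3$ are our choice, and floating-point arithmetic is used, which is adequate only for small $Q$.

```python
import math


def nearest_integer_distance(x):
    """Distance from x to the nearest integer."""
    return abs(x - round(x))


def simultaneous_denominator(xis, Q):
    """Search 1 <= q <= Q^k for a q with every q*xi within 1/Q of an integer."""
    k = len(xis)
    for q in range(1, Q ** k + 1):
        if all(nearest_integer_distance(q * xi) < 1 / Q for xi in xis):
            return q
    return None  # not expected, by the box argument


if __name__ == "__main__":
    xis = [math.sqrt(2), math.sqrt(3)]
    for Q in (5, 10, 20):
        q = simultaneous_denominator(xis, Q)
        print(Q, q, [nearest_integer_distance(q * xi) for xi in xis])
```

The search returns the least admissible $q$; the theorem asserts only that one exists not exceeding $Q^k$.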


11.13. The transcendence of e. We conclude this chapter by
proving that $e$ and $\pi$ are transcendental.

Our work will be considerably simplified by the introduction of a
symbol $h^r$, which we define by
$$h^0 = 1, \qquad h^r = r! \quad (r \ge 1).$$
If $f(x)$ is any polynomial in $x$ of degree $m$, say
$$f(x) = \sum_{r=0}^{m} c_r x^r,$$
then we define $f(h)$ as
$$\sum_{r=0}^{m} c_r h^r = \sum_{r=0}^{m} c_r\, r!$$
(where $0!$ is to be interpreted as 1). Finally we define $f(x+h)$ in the
manner suggested by Taylor’s theorem, viz. as
$$\sum_{r=0}^{m} \frac{f^{(r)}(x)}{r!}\, h^r = \sum_{r=0}^{m} f^{(r)}(x).$$
If $f(x+y) = F(y)$, then $f(x+h) = F(h)$.
We define $u_r(x)$ and $\epsilon_r(x)$, for $r = 0, 1, 2, \ldots$, by
$$u_r(x) = \frac{x}{r+1}+\frac{x^2}{(r+1)(r+2)}+\ldots = e^{|x|}\epsilon_r(x).$$
It is obvious that $|u_r(x)| < e^{|x|}$, and so

(11.13.1) $$|\epsilon_r(x)| < 1,$$

for all $x$.
We require two lemmas. 


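THEOREM 202. If $\phi(x)$ is any polynomial and

(11.13.2) $$\phi(x) = \sum_{r=0}^{s} c_r x^r, \qquad \psi(x) = \sum_{r=0}^{s} c_r \epsilon_r(x) x^r,$$

then

(11.13.3) $$e^x\phi(h) = \phi(x+h)+\psi(x)e^{|x|}.$$

By our definitions above we have
$$(x+h)^r = h^r+rxh^{r-1}+\frac{r(r-1)}{1\cdot 2}x^2h^{r-2}+\ldots+x^r$$
$$= r!+r(r-1)!\,x+\frac{r(r-1)}{1\cdot 2}(r-2)!\,x^2+\ldots+x^r$$
$$= r!\left(1+x+\frac{x^2}{2!}+\ldots+\frac{x^r}{r!}\right)$$
$$= r!\,e^x-u_r(x)x^r = e^xh^r-u_r(x)x^r.$$
Hence
$$e^xh^r = (x+h)^r+u_r(x)x^r = (x+h)^r+e^{|x|}\epsilon_r(x)x^r.$$

Multiplying this throughout by $c_r$, and summing, we obtain (11.13.3).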
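
The identity (11.13.3) lends itself to a numerical check. In the following Python sketch (our own illustration; the function names and the truncation of the series for $u_r(x)$ after a fixed number of terms are assumptions), the symbolic power $(x+h)^r$ is expanded by the binomial theorem with $h^j$ replaced by $j!$, and $\psi(x)e^{|x|}$ is computed as $\sum c_r u_r(x) x^r$.

```python
import math


def power_at_h(r, x):
    """(x + h)^r expanded binomially, with each h^j replaced by j!."""
    return sum(
        math.comb(r, j) * x ** (r - j) * math.factorial(j) for j in range(r + 1)
    )


def u(r, x, terms=80):
    """Partial sum of u_r(x) = x/(r+1) + x^2/((r+1)(r+2)) + ..."""
    total, term = 0.0, 1.0
    for j in range(1, terms + 1):
        term *= x / (r + j)
        total += term
    return total


def both_sides(coeffs, x):
    """Return (e^x * phi(h), phi(x+h) + psi(x) e^{|x|}) for phi = sum c_r x^r."""
    phi_h = sum(c * math.factorial(r) for r, c in enumerate(coeffs))
    lhs = math.exp(x) * phi_h
    rhs = sum(
        c * (power_at_h(r, x) + u(r, x) * x ** r) for r, c in enumerate(coeffs)
    )
    return lhs, rhs


if __name__ == "__main__":
    print(both_sides([2, -3, 0, 5], 1.5))
    print(both_sides([1, 4, -2], -0.75))
```

The two numbers in each pair agree to within rounding error, for negative as well as positive $x$.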
As in § 7.2, we call a polynomial in $x$, or in $x$, $y$,..., whose coefficients
are integers, an integral polynomial in $x$, or $x$, $y$,....


THEOREM 203. If $m \ge 2$, $f(x)$ is an integral polynomial in $x$, and
$$F_1(x) = \frac{x^{m-1}}{(m-1)!}f(x), \qquad F_2(x) = \frac{x^{m}}{(m-1)!}f(x),$$
then $F_1(h)$, $F_2(h)$ are integers and
$$F_1(h) \equiv f(0), \qquad F_2(h) \equiv 0 \pmod m.$$

Suppose that
$$f(x) = \sum_{l=0}^{L} a_l x^l,$$
where $a_0$,..., $a_L$ are integers. Then
$$F_1(x) = \sum_{l=0}^{L} a_l \frac{x^{l+m-1}}{(m-1)!},$$
and so
$$F_1(h) = \sum_{l=0}^{L} a_l \frac{(l+m-1)!}{(m-1)!}.$$
But
$$\frac{(l+m-1)!}{(m-1)!}$$
is an integral multiple of $m$ if $l \ge 1$; and therefore
$$F_1(h) \equiv a_0 = f(0) \pmod m.$$

Similarly
$$F_2(x) = \sum_{l=0}^{L} a_l \frac{x^{l+m}}{(m-1)!},$$
$$F_2(h) = \sum_{l=0}^{L} a_l \frac{(l+m)!}{(m-1)!} \equiv 0 \pmod m.$$
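
The two congruences are easily confirmed by machine for a particular polynomial. The Python sketch below is an illustration of our own (the function names and the sample polynomial are hypothetical): it evaluates $F_1(h)$ and $F_2(h)$ from the formulae just obtained, replacing each $h^r$ by $r!$.

```python
from math import factorial


def eval_at_h(coeffs):
    """f(h) for f(x) = sum coeffs[r] x^r, with h^r interpreted as r!."""
    return sum(c * factorial(r) for r, c in enumerate(coeffs))


def theorem_203_values(f_coeffs, m):
    """Return (F1(h), F2(h)) for F1 = x^(m-1) f/(m-1)! and F2 = x^m f/(m-1)!."""
    denom = factorial(m - 1)
    # (l+m-1)!/(m-1)! and (l+m)!/(m-1)! are integers, so // is exact here.
    f1 = sum(a * (factorial(l + m - 1) // denom) for l, a in enumerate(f_coeffs))
    f2 = sum(a * (factorial(l + m) // denom) for l, a in enumerate(f_coeffs))
    return f1, f2


if __name__ == "__main__":
    f = [3, -1, 4, 2]  # f(x) = 3 - x + 4x^2 + 2x^3, so f(0) = 3
    f1, f2 = theorem_203_values(f, 5)
    print(f1, f1 % 5, f2, f2 % 5)
```

With this $f$ and $m = 5$ the values are $F_1(h) = 538 \equiv 3$ and $F_2(h) = 4185 \equiv 0 \pmod 5$.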


We are now in a position to prove the first of our two main theorems, 
namely 


THEOREM 204. $e$ is transcendental.


If the theorem is not true, then

(11.13.4) $$\sum_{t=0}^{n} C_t e^t = 0,$$

where $n \ge 1$, $C_0$, $C_1$,..., $C_n$ are integers, and $C_0 \ne 0$.
We suppose that $p$ is a prime greater than $\max(n, |C_0|)$, and define
$\phi(x)$ by
$$\phi(x) = \frac{x^{p-1}}{(p-1)!}\{(x-1)(x-2)\ldots(x-n)\}^p.$$

Ultimately, $p$ will be large. If we multiply (11.13.4) by $\phi(h)$, and use
(11.13.3), we obtain
$$\sum_{t=0}^{n} C_t\phi(t+h)+\sum_{t=0}^{n} C_t\psi(t)e^t = 0,$$
or

(11.13.5) $$S_1+S_2 = 0,$$

say.
By Theorem 203, with $m = p$, $\phi(h)$ is an integer and
$$\phi(h) \equiv (-1)^{pn}(n!)^p \pmod p.$$
Again, if $1 \le t \le n$,
$$\phi(t+x) = \frac{(t+x)^{p-1}}{(p-1)!}\{(x+t-1)\ldots x(x-1)\ldots(x+t-n)\}^p = \frac{x^p}{(p-1)!}f(x),$$
where $f(x)$ is an integral polynomial in $x$. It follows (again from Theorem 203)
that $\phi(t+h)$ is an integer divisible by $p$. Hence
$$S_1 = \sum_{t=0}^{n} C_t\phi(t+h) \equiv (-1)^{pn}C_0(n!)^p \not\equiv 0 \pmod p,$$
since $C_0 \ne 0$ and $p > \max(n, |C_0|)$. Thus $S_1$ is an integer, not zero; and
therefore

(11.13.6) $$|S_1| \ge 1.$$

On the other hand, $|\epsilon_r(x)| < 1$, by (11.13.1), and so
$$|\psi(t)| < \sum_{r=0}^{s} |c_r|\,t^r \le \frac{t^{p-1}}{(p-1)!}\{(t+1)(t+2)\ldots(t+n)\}^p \to 0$$
when $p \to \infty$. Hence $S_2 \to 0$, and we can make

(11.13.7) $$|S_2| < \tfrac12$$

by choosing a sufficiently large value of $p$. The formulae (11.13.5),
(11.13.6), and (11.13.7) are in contradiction. Hence (11.13.4) is impossible
and $e$ is transcendental.

The proof which precedes is a good deal more sophisticated than the 
simple proof of the irrationality of e given in § 4.7, but the ideas which 
underlie it are essentially the same. We use (i) the exponential series
and (ii) the theorem that an integer whose modulus is less than 1 
must be 0. 


11.14. The transcendence of $\pi$. Finally we prove that $\pi$ is
transcendental. It is this theorem which settles the problem of the
‘quadrature of the circle’.

THEOREM 205. $\pi$ is transcendental.

The proof is very similar to that of Theorem 204, but there are one
or two slight additional complications.

Suppose that $\beta_1$, $\beta_2$,..., $\beta_m$ are the roots of an equation
$$dx^m+d_1x^{m-1}+\ldots+d_m = 0$$
with integral coefficients. Any symmetrical integral polynomial in
$$d\beta_1, d\beta_2, \ldots, d\beta_m$$
is an integral polynomial in
$$d_1, d_2, \ldots, d_m,$$
and is therefore an integer.

Now let us suppose that $\pi$ is algebraic. Then $i\pi$ is algebraic,† and
therefore the root of an equation
$$dx^m+d_1x^{m-1}+\ldots+d_m = 0,$$
where $m \ge 1$, $d$, $d_1$,..., $d_m$ are integers, and $d \ne 0$. If the roots of this
equation are
$$\omega_1, \omega_2, \ldots, \omega_m,$$
then $1+e^{\omega} = 1+e^{i\pi} = 0$ for some $\omega$, and therefore
$$(1+e^{\omega_1})(1+e^{\omega_2})\ldots(1+e^{\omega_m}) = 0.$$
Multiplying this out, we obtain

(11.14.1) $$1+\sum_{t=1}^{2^m-1} e^{\alpha_t} = 0,$$

where

(11.14.2) $$\alpha_1, \alpha_2, \ldots, \alpha_{2^m-1}$$

are the $2^m-1$ numbers
$$\omega_1, \ldots, \omega_m,\ \omega_1+\omega_2,\ \omega_1+\omega_3, \ldots,\ \omega_1+\omega_2+\ldots+\omega_m$$
in some order.
Let us suppose that $C-1$ of the $\alpha$ are zero and that the remaining
$$n = 2^m-1-(C-1)$$
are not zero; and that the non-zero $\alpha$ are arranged first, so that (11.14.2)
reads
$$\alpha_1, \ldots, \alpha_n, 0, 0, \ldots, 0.$$
Then it is clear that any symmetrical integral polynomial in

(11.14.3) $$d\alpha_1, \ldots, d\alpha_n$$

is a symmetrical integral polynomial in
$$d\alpha_1, \ldots, d\alpha_n, 0, 0, \ldots, 0,$$
i.e. in
$$d\alpha_1, d\alpha_2, \ldots, d\alpha_{2^m-1}.$$
Hence any such function is a symmetrical integral polynomial in
$$d\omega_1, d\omega_2, \ldots, d\omega_m,$$
and so an integer.
We can write (11.14.1) as

(11.14.4) $$C+\sum_{t=1}^{n} e^{\alpha_t} = 0.$$
† If $a_0x^n+a_1x^{n-1}+\ldots+a_n = 0$ and $y = ix$, then
$$a_0y^n-a_2y^{n-2}+\ldots+i(a_1y^{n-1}-a_3y^{n-3}+\ldots) = 0$$
and so $(a_0y^n-a_2y^{n-2}+\ldots)^2+(a_1y^{n-1}-a_3y^{n-3}+\ldots)^2 = 0$.

We choose a prime $p$ such that

(11.14.5) $$p > \max(d, C, |d^n\alpha_1\ldots\alpha_n|)$$

and define $\phi(x)$ by

(11.14.6) $$\phi(x) = \frac{d^{np+p-1}x^{p-1}}{(p-1)!}\{(x-\alpha_1)(x-\alpha_2)\ldots(x-\alpha_n)\}^p.$$

Multiplying (11.14.4) by $\phi(h)$, and using (11.13.3), we obtain

(11.14.7) $$S_0+S_1+S_2 = 0,$$

where

(11.14.8) $$S_0 = C\phi(h),$$

(11.14.9) $$S_1 = \sum_{t=1}^{n} \phi(\alpha_t+h),$$

(11.14.10) $$S_2 = \sum_{t=1}^{n} \psi(\alpha_t)e^{|\alpha_t|}.$$

Now
$$\phi(x) = \frac{x^{p-1}}{(p-1)!}\sum_{l=0}^{np} g_l x^l,$$
where $g_l$ is a symmetric integral polynomial in the numbers (11.14.3),
and so an integer. It follows from Theorem 203 that $\phi(h)$ is an integer,
and that

(11.14.11) $$\phi(h) \equiv g_0 = (-1)^{pn}d^{p-1}(d\alpha_1 . d\alpha_2 . \ldots . d\alpha_n)^p \pmod p.$$

Hence $S_0$ is an integer; and

(11.14.12) $$S_0 \equiv Cg_0 \not\equiv 0 \pmod p,$$

because of (11.14.5).
Next, by substitution and rearrangement, we see that
$$\phi(\alpha_t+x) = \frac{x^p}{(p-1)!}\sum_{l=0}^{np-1} f_{l,t}\,x^l,$$
where
$$f_{l,t} = f_l(d\alpha_t;\ d\alpha_1, d\alpha_2, \ldots, d\alpha_{t-1}, d\alpha_{t+1}, \ldots, d\alpha_n)$$
is an integral polynomial in the numbers (11.14.3), symmetrical in all
but $d\alpha_t$. Hence
$$\sum_{t=1}^{n} \phi(\alpha_t+x) = \frac{x^p}{(p-1)!}\sum_{l=0}^{np-1} F_l x^l,$$
where
$$F_l = \sum_{t=1}^{n} f_{l,t} = \sum_{t=1}^{n} f_l(d\alpha_t;\ d\alpha_1, \ldots, d\alpha_{t-1}, d\alpha_{t+1}, \ldots, d\alpha_n).$$

It follows that $F_l$ is an integral polynomial symmetrical in all the numbers
(11.14.3), and so an integer. Hence, by Theorem 203,
$$S_1 = \sum_{t=1}^{n} \phi(\alpha_t+h)$$
is an integer, and

(11.14.13) $$S_1 \equiv 0 \pmod p.$$


From (11.14.12) and (11.14.13) it follows that $S_0+S_1$ is an integer
not divisible by $p$, and so that

(11.14.14) $$|S_0+S_1| \ge 1.$$

On the other hand,
$$|\psi(x)| < \frac{|d|^{np+p-1}|x|^{p-1}}{(p-1)!}\{(|x|+|\alpha_1|)\ldots(|x|+|\alpha_n|)\}^p \to 0,$$
for any fixed $x$, when $p \to \infty$. It follows that

(11.14.15) $$|S_2| < \tfrac12$$

for sufficiently large $p$. The three formulae (11.14.7), (11.14.14), and
(11.14.15) are in contradiction, and therefore $\pi$ is transcendental.

In particular $\pi$ is not a ‘Euclidean’ number in the sense of § 11.5;
and therefore it is impossible to construct, by Euclidean methods, a
length equal to the circumference of a circle of unit diameter.

It may be proved by the methods of this section that
$$\alpha_1e^{\beta_1}+\alpha_2e^{\beta_2}+\ldots+\alpha_se^{\beta_s} \ne 0$$
if the $\alpha$ and $\beta$ are algebraic, the $\alpha$ are not all zero, and no two $\beta$ are
equal.
It has been proved more recently that $\alpha^\beta$ is transcendental if $\alpha$ and $\beta$
are algebraic, $\alpha$ is not 0 or 1, and $\beta$ is irrational. This shows in particular
that $e^{-\pi}$, which is one of the values of $i^{2i}$, is transcendental. It also
shows that
$$\theta = \frac{\log 3}{\log 2}$$
is transcendental, since $2^\theta = 3$ and $\theta$ is irrational.†


NOTES ON CHAPTER XI 


§ 11.3. Dirichlet’s argument depends upon the principle ‘if there are $n+1$
objects in $n$ boxes, there must be at least one box which contains two (or more)
of the objects’ (the Schubfachprinzip of German writers). That in § 11.12 is
essentially the same.

§§ 11.6-7. A full account of Cantor’s work in the theory of aggregates
(Mengenlehre) will be found in Hobson’s Theory of functions of a real variable, i.

Liouville’s work was published in the Journal de Math. (1) 16 (1851), 133-42, 
over twenty years before Cantor’s. See also the note on §§ 11.13-14. 

Theorem 191 has been improved successively by Thue, Siegel, Dyson, and 
Gelfond. Finally Roth (Mathematika, 2 (1955), 1-20) showed that no irrational 
algebraic number is approximable to any order greater than 2. 


† See § 4.7.

§§ 11.8-9. Theorems 193 and 194 are due to Hurwitz, Math. Ann. 39 (1891),
279-84; and Theorem 195 to Borel, Journal de Math. (5), 9 (1903), 329-75. Our
proofs follow Perron (Kettenbrüche, 49-52, and Irrationalzahlen, 129-31).

§ 11.10. The theorem with $2\sqrt2$ is also due to Hurwitz, l.c. supra. For fuller
information see Koksma, 29 et seq.

Theorems 196 and 197 were proved by Borel, Rendiconti del circolo mat. di
Palermo, 27 (1909), 247-71, and F. Bernstein, Math. Ann. 71 (1912), 417-39.
For further refinements see Khintchine, Compositio Math. 1 (1934), 361-83, and
Dyson, Journal London Math. Soc. 18 (1943), 40-43.

§ 11.11. For Theorem 199 see Khintchine, Math. Ann. 92 (1924), 115-25.

§ 11.12. We lost nothing by supposing $p/q$ irreducible throughout §§ 11.1-11.
Suppose, for example, that $p/q$ is a reducible solution of (11.1.1). Then if
$(p, q) = d > 1$, and we write $p = dp'$, $q = dq'$, we have $(p', q') = 1$ and
$$\left|\frac{p'}{q'}-\xi\right| = \left|\frac{p}{q}-\xi\right| < \frac{1}{q^2} < \frac{1}{q'^2},$$
so that $p'/q'$ is an irreducible solution of (11.1.1).

This sort of reduction is no longer possible when we require a number of rational
fractions with the same denominator, and some of our conclusions here would
become false if we insisted on irreducibility. For example, in order that the
system (11.12.1) should have an infinity of solutions, it would be necessary, after
§ 11.1 (1), that every $\xi_i$ should be irrational.

We owe this remark to Dr. Wylie. 

§§ 11.13-14. The transcendence of $e$ was proved first by Hermite, Comptes
rendus, 77 (1873), 18-24, etc. (Œuvres, iii. 150-81); and that of $\pi$ by F. Lindemann,
Math. Ann. 20 (1882), 213-25. The proofs were afterwards modified and simplified
by Hilbert, Hurwitz, and other writers. The form in which we give them is in
essentials the same as that in Landau, Vorlesungen, iii. 90-95, or Perron,
Irrationalzahlen, 174-82.

The problem of proving the transcendentality of $\alpha^\beta$, under the conditions stated
at the end of § 11.14, was propounded by Hilbert in 1900, and solved independently
by Gelfond and Schneider, by different methods, in 1934. Fuller
details, and references to the proofs of the transcendentality of the other numbers
mentioned at the end of § 11.7, will be found in Koksma, ch. iv.


XII


THE FUNDAMENTAL THEOREM OF ARITHMETIC
IN $k(1)$, $k(i)$, AND $k(\rho)$

12.1. Algebraic numbers and integers. In this chapter we consider
some simple generalizations of the notion of an integer.

We defined an algebraic number in § 11.5; $\xi$ is an algebraic number
if it is a root of an equation
$$c_0\xi^n+c_1\xi^{n-1}+\ldots+c_n = 0 \qquad (c_0 \ne 0)$$
whose coefficients are rational integers.† If
$$c_0 = 1,$$
then $\xi$ is said to be an algebraic integer. This is the natural definition,
since a rational $\xi = a/b$ satisfies $b\xi-a = 0$, and is an integer when
$b = 1$.

Thus
$$i = \sqrt{(-1)}$$
and

(12.1.1) $$\rho = e^{\frac23\pi i} = \tfrac12(-1+i\sqrt3)$$

are algebraic integers, since
$$i^2+1 = 0$$
and
$$\rho^2+\rho+1 = 0.$$

When $n = 2$, $\xi$ is said to be a quadratic number, or integer, as the
case may be.
These definitions enable us to restate Theorem 45 in the form


THEOREM 206. An algebraic integer, if rational, is a rational integer.


† We defined the ‘rational integers’ in § 1.1. Since then we have described them simply
as the ‘integers’, but now it becomes important to distinguish them explicitly from
integers of other kinds.


12.2. The rational integers, the Gaussian integers, and the
integers of $k(\rho)$. For the present we shall be concerned only with
the three simplest classes of algebraic integers.

(1) The rational integers (defined in § 1.1) are the algebraic integers
for which $n = 1$. For reasons which will appear later, we shall call the
rational integers the integers of $k(1)$.‡

‡ We shall define $k(\vartheta)$ generally in § 14.1. $k(1)$ is in fact the class of rationals; we shall
not use a special symbol for the sub-class of rational integers. $k(i)$ is the class of numbers
$r+si$, where $r$ and $s$ are rational; and $k(\rho)$ is defined similarly.

(2) The complex or ‘Gaussian’ integers are the numbers
$$\xi = a+bi,$$
where $a$ and $b$ are rational integers. Since
$$\xi^2-2a\xi+a^2+b^2 = 0,$$
a Gaussian integer is a quadratic integer. We call the Gaussian integers
the integers of $k(i)$. In particular, any rational integer is a Gaussian
integer.
Since
$$(a+bi)+(c+di) = (a+c)+(b+d)i,$$
$$(a+bi)(c+di) = ac-bd+(ad+bc)i,$$
sums and products of Gaussian integers are Gaussian integers. More
generally, if $\alpha$, $\beta$,..., $\kappa$ are Gaussian integers, and
$$\xi = P(\alpha, \beta, \ldots, \kappa),$$
where $P$ is a polynomial whose coefficients are rational or Gaussian
integers, then $\xi$ is a Gaussian integer.

(3) If $\rho$ is defined by (12.1.1), then
$$\rho^2 = e^{\frac43\pi i} = \tfrac12(-1-i\sqrt3),$$
$$\rho+\rho^2 = -1, \qquad \rho\rho^2 = 1.$$
If
$$\xi = a+b\rho,$$
where $a$ and $b$ are rational integers, then
$$(\xi-a-b\rho)(\xi-a-b\rho^2) = 0$$
or
$$\xi^2-(2a-b)\xi+a^2-ab+b^2 = 0,$$
so that $\xi$ is a quadratic integer. We call the numbers $\xi$ the integers of
$k(\rho)$. Since
$$\rho^2+\rho+1 = 0, \qquad a+b\rho = a-b-b\rho^2, \qquad a+b\rho^2 = a-b-b\rho,$$
we might equally have defined the integers of $k(\rho)$ as the numbers
$a+b\rho^2$.

The properties of the integers of $k(i)$ and $k(\rho)$ resemble in many ways
those of the rational integers. Our object in this chapter is to study the
simplest properties common to the three classes of numbers, and in
particular the property of ‘unique factorization’. This study is important
for two reasons, first because it is interesting to see how far
the properties of ordinary integers are susceptible to generalization, and
secondly because many properties of the rational integers themselves
follow most simply and most naturally from those of wider classes.

We shall use small latin letters $a$, $b$,..., as we have usually done, to
denote rational integers, except that $i$ will always be $\sqrt{(-1)}$. Integers
of $k(i)$ or $k(\rho)$ will be denoted by Greek letters $\alpha$, $\beta$,....


12.3. Euclid’s algorithm. We have already proved the ‘fundamental
theorem of arithmetic’, for the rational integers, by two different
methods, in §§ 2.10 and 2.11. We shall now give a third proof which is
important both logically and historically and will serve us as a model
when extending it to other classes of numbers.†

Suppose that
$$a \ge b > 0.$$
Dividing $a$ by $b$ we obtain
$$a = q_1b+r_1,$$
where $0 \le r_1 < b$. If $r_1 \ne 0$, we can repeat the process, and obtain
$$b = q_2r_1+r_2,$$
where $0 \le r_2 < r_1$. If $r_2 \ne 0$,
$$r_1 = q_3r_2+r_3,$$
where $0 \le r_3 < r_2$; and so on. The non-negative integers $b$, $r_1$, $r_2$,...,
form a decreasing sequence, and so
$$r_{n+1} = 0$$
for some $n$. The last two steps of the process will be
$$r_{n-2} = q_nr_{n-1}+r_n \qquad (0 < r_n < r_{n-1}),$$
$$r_{n-1} = q_{n+1}r_n.$$

This system of equations for $r_1$, $r_2$,... is known as Euclid’s algorithm.
It is the same, except for notation, as that of § 10.6.

Euclid’s algorithm embodies the ordinary process for finding the
highest common divisor of $a$ and $b$, as is shown by the next theorem.

THEOREM 207. $r_n = (a, b)$.


Let $d = (a, b)$. Then, using the successive steps of the algorithm, we
have
$$d \mid a \,.\, d \mid b \to d \mid r_1 \to d \mid r_2 \to \ldots \to d \mid r_n,$$
so that $d \le r_n$. Again, working backwards,
$$r_n \mid r_{n-1} \to r_n \mid r_{n-2} \to r_n \mid r_{n-3} \to \ldots \to r_n \mid b \to r_n \mid a.$$
Hence $r_n$ divides both $a$ and $b$. Since $d$ is the greatest of the common
divisors of $a$ and $b$, it follows that $r_n \le d$, and therefore that $r_n = d$.
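
The system of equations translates directly into a short program. The following Python sketch is a plain illustration (the function names are ours): it records the successive pairs $(q_i, r_i)$ and returns the last non-zero remainder, which by Theorem 207 is $(a, b)$.

```python
def euclid_steps(a, b):
    """Return the pairs (q_1, r_1), (q_2, r_2), ... for integers a >= b > 0.

    The final pair has remainder 0; each line is dividend = q * divisor + r.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((q, r))
        a, b = b, r
    return steps


def highest_common_divisor(a, b):
    """Last non-zero remainder r_n, or b itself when b already divides a."""
    nonzero = [r for _, r in euclid_steps(a, b) if r != 0]
    return nonzero[-1] if nonzero else b


if __name__ == "__main__":
    print(euclid_steps(1071, 462))
    print(highest_common_divisor(1071, 462))
```

For $a = 1071$, $b = 462$ the lines are $1071 = 2 \cdot 462+147$, $462 = 3 \cdot 147+21$, $147 = 7 \cdot 21$, so that $(1071, 462) = 21$.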


† The fundamental idea of the proof is the same as that of the proof of § 2.10: the
numbers divisible by $d = (a, b)$ form a ‘modulus’. But here we determine $d$ by a direct
construction.


12.4. Application of Euclid’s algorithm to the fundamental
theorem in $k(1)$. We base the proof of the fundamental theorem
on two preliminary theorems. The first is merely a repetition of
Theorem 26, but it is convenient to restate it and deduce it from the
algorithm. The second is substantially equivalent to Theorem 3.


THEOREM 208. If $f \mid a$, $f \mid b$, then $f \mid (a, b)$.

For
$$f \mid a \,.\, f \mid b \to f \mid r_1 \to f \mid r_2 \to \ldots \to f \mid r_n,$$
or $f \mid d$.

THEOREM 209. If $(a, b) = 1$ and $b \mid ac$, then $b \mid c$.

If we multiply each line of the algorithm by $c$, we obtain
$$ac = q_1bc+r_1c,$$
$$\ldots\ldots\ldots\ldots$$
$$r_{n-2}c = q_nr_{n-1}c+r_nc,$$
$$r_{n-1}c = q_{n+1}r_nc,$$
which is the algorithm we should have obtained if we started with $ac$
and $bc$ instead of $a$ and $b$. Here
$$r_n = (a, b) = 1$$
and so
$$(ac, bc) = r_nc = c.$$
Now $b \mid ac$, by hypothesis, and $b \mid bc$. Hence, by Theorem 208,
$$b \mid (ac, bc) = c,$$
which is what we had to prove.

If $p$ is a prime, then either $p \mid a$ or $(a, p) = 1$. In the latter case,
by Theorem 209, $p \mid ac$ implies $p \mid c$. Thus $p \mid ac$ implies $p \mid a$ or $p \mid c$.
This is Theorem 3, and from Theorem 3 the fundamental theorem
follows as in § 1.3.

It will be useful to restate the fundamental theorem in a slightly
different form which extends more naturally to the integers of $k(i)$ and
$k(\rho)$. We call the numbers
$$\epsilon = \pm 1,$$
the divisors of 1, the unities of $k(1)$. The two numbers
$$\epsilon m$$
we call associates. Finally we define a prime as an integer of $k(1)$ which
is not 0 or a unity and is not divisible by any number except the unities
and its associates. The primes are then
$$\pm 2, \pm 3, \pm 5, \ldots,$$
and the fundamental theorem takes the form: any integer $n$ of $k(1)$, not
0 or a unity, can be expressed as a product of primes, and the expression
is unique except in regard to (a) the order of the factors, (b) the presence
of unities as factors, and (c) ambiguities between associated primes.

12.5. Historical remarks on Euclid’s algorithm and the funda- 
mental theorem. Euclid’s algorithm is explained at length in Book vii 
of the Elements (Props. 1-3). Euclid deduces from the algorithm,
effectively, that
$$f \mid a \,.\, f \mid b \to f \mid (a, b)$$
and
$$(ac, bc) = (a, b)c.$$
He has thus the weapons which were essential in our proof.

The actual theorem which he proves (vii. 24) is ‘if two numbers be
prime to any number, their product also will be prime to the same’; i.e.

(12.5.1) $$(a, c) = 1 \,.\, (b, c) = 1 \to (ab, c) = 1.$$

Our Theorem 3 follows from this by taking $c$ a prime $p$, and we can
prove (12.5.1) by a slight change in the argument of § 12.4. But Euclid’s
method of proof, which depends on the notions of ‘parts’ and ‘proportion’,
is essentially different.

It might seem strange at first that Euclid, having gone so far, could 
not prove the fundamental theorem itself; but this view would rest 
on a misconception. Euclid had no formal calculus of multiplication 
and exponentiation, and it would have been most difficult for him even 
to state the theorem. He had not even a term for the product of more 
than three factors. The omission of the fundamental theorem is in no 
way casual or accidental; Euclid knew very well that the theory of 
numbers turned upon his algorithm, and drew from it all the return he 
could. 


12.6. Properties of the Gaussian integers. Throughout this and 
the next two sections the word ‘integer’ means Gaussian integer or 
integer of k(i). 

We define ‘divisible’ and ‘divisor’ in $k(i)$ in the same way as in $k(1)$;
an integer $\xi$ is said to be divisible by an integer $\eta$, not 0, if there exists
an integer $\zeta$ such that
$$\xi = \eta\zeta;$$
and $\eta$ is then said to be a divisor of $\xi$. We express this by $\eta \mid \xi$. Since
$1$, $-1$, $i$, $-i$ are all integers, any $\xi$ has the eight ‘trivial’ divisors
$$1, \xi, -1, -\xi, i, i\xi, -i, -i\xi.$$
Divisibility has the obvious properties expressed by
$$\alpha \mid \beta \,.\, \beta \mid \gamma \to \alpha \mid \gamma,$$
$$\alpha \mid \gamma_1 \,.\, \ldots \,.\, \alpha \mid \gamma_n \to \alpha \mid \beta_1\gamma_1+\ldots+\beta_n\gamma_n.$$

The integer $\epsilon$ is said to be a unity of $k(i)$ if $\epsilon \mid \xi$ for every $\xi$ of $k(i)$.
Alternatively, we may define a unity as any integer which is a divisor
of 1. The two definitions are equivalent, since 1 is a divisor of every
integer of the field, and
$$\epsilon \mid 1 \,.\, 1 \mid \xi \to \epsilon \mid \xi.$$

The norm of an integer $\xi$ is defined by
$$N\xi = N(a+bi) = a^2+b^2.$$
If $\bar\xi$ is the conjugate of $\xi$, then
$$N\xi = \xi\bar\xi = |\xi|^2.$$
Since
$$(a^2+b^2)(c^2+d^2) = (ac-bd)^2+(ad+bc)^2,$$
$N\xi$ has the properties
$$N\xi N\eta = N(\xi\eta), \qquad N\xi N\eta\ldots = N(\xi\eta\ldots).$$

THEOREM 210. The norm of a unity is 1, and any integer whose norm
is 1 is a unity.

If $\epsilon$ is a unity, then $\epsilon \mid 1$. Hence $1 = \epsilon\eta$, and so
$$1 = N\epsilon N\eta, \qquad N\epsilon \mid 1, \qquad N\epsilon = 1.$$
On the other hand, if $N(a+bi) = 1$, we have
$$1 = a^2+b^2 = (a+bi)(a-bi), \qquad a+bi \mid 1,$$
and so $a+bi$ is a unity.

THEOREM 211. The unities of $k(i)$ are
$$\epsilon = i^s \qquad (s = 0, 1, 2, 3).$$
The only solutions of $a^2+b^2 = 1$ are
$$a = \pm 1,\ b = 0; \qquad a = 0,\ b = \pm 1,$$
so that the unities are $\pm 1$, $\pm i$.

If $\epsilon$ is any unity, then $\epsilon\xi$ is said to be associated with $\xi$. The associates
of $\xi$ are
$$\xi, i\xi, -\xi, -i\xi;$$
and the associates of 1 are the unities. It is clear that if $\xi \mid \eta$ then
$\xi\epsilon_1 \mid \eta\epsilon_2$, where $\epsilon_1$, $\epsilon_2$ are any unities. Hence, if $\eta$ is divisible by $\xi$, any
associate of $\eta$ is divisible by any associate of $\xi$.
12.7. Primes in $k(i)$. A prime is an integer, not 0 or a unity,
divisible only by numbers associated with itself or with 1. We reserve
the letter $\pi$ for primes.† A prime $\pi$ has no divisors except the eight
trivial divisors
$$1, \pi, -1, -\pi, i, i\pi, -i, -i\pi.$$
The associates of a prime are clearly also primes.

† There will be no danger of confusion with the ordinary use of $\pi$.

THEOREM 212. An integer whose norm is a rational prime is a prime.

For suppose that $N\xi = p$, and that $\xi = \eta\zeta$. Then
$$p = N\xi = N\eta N\zeta.$$
Hence either $N\eta = 1$ or $N\zeta = 1$, and either $\eta$ or $\zeta$ is a unity; and therefore
$\xi$ is a prime. Thus $N(2+i) = 5$, and $2+i$ is a prime.

The converse theorem is not true; thus $N3 = 9$, but 3 is a prime.
For suppose that
$$3 = (a+bi)(c+di).$$
Then
$$9 = (a^2+b^2)(c^2+d^2).$$
It is impossible that
$$a^2+b^2 = c^2+d^2 = 3$$
(since 3 is not the sum of two squares), and therefore either $a^2+b^2 = 1$
or $c^2+d^2 = 1$, and either $a+bi$ or $c+di$ is a unity. It follows that 3
is a prime.

A rational integer, prime in $k(i)$, must be a rational prime; but not
all rational primes are prime in $k(i)$. Thus
$$5 = (2+i)(2-i).$$

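THEOREM 213. Any integer, not 0 or a unity, is divisible by a prime.

If $\gamma$ is an integer, and not a prime, then
$$\gamma = \alpha_1\beta_1, \qquad N\alpha_1 > 1, \qquad N\beta_1 > 1, \qquad N\gamma = N\alpha_1N\beta_1,$$
and so
$$1 < N\alpha_1 < N\gamma.$$
If $\alpha_1$ is not a prime, then
$$\alpha_1 = \alpha_2\beta_2, \qquad N\alpha_2 > 1, \qquad N\beta_2 > 1,$$
$$N\alpha_1 = N\alpha_2N\beta_2, \qquad 1 < N\alpha_2 < N\alpha_1.$$
We may continue this process so long as $\alpha_r$ is not prime. Since
$$N\gamma, N\alpha_1, N\alpha_2, \ldots$$
is a decreasing sequence of positive rational integers, we must sooner
or later come to a prime $\alpha_r$; and if $\alpha_r$ is the first prime in the sequence
$\gamma$, $\alpha_1$, $\alpha_2$,..., then
$$\gamma = \beta_1\alpha_1 = \beta_1\beta_2\alpha_2 = \ldots = \beta_1\beta_2\beta_3\ldots\beta_r\alpha_r,$$
and so $\alpha_r \mid \gamma$.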
and SO a, y. 
THEOREM 214. Any integer, not 0 or a unity, is a product of primes.

If $\gamma$ is not 0 or a unity, it is divisible by a prime $\pi_1$. Hence
$$\gamma = \pi_1\gamma_1, \qquad N\gamma_1 < N\gamma.$$
Either $\gamma_1$ is a unity or
$$\gamma_1 = \pi_2\gamma_2, \qquad N\gamma_2 < N\gamma_1.$$
Continuing this process we obtain a decreasing sequence
$$N\gamma, N\gamma_1, N\gamma_2, \ldots,$$
of positive rational integers. Hence $N\gamma_r = 1$ for some $r$, and $\gamma_r$ is a
unity $\epsilon$; and therefore
$$\gamma = \pi_1\pi_2\ldots\pi_r\epsilon = \pi_1\ldots\pi_{r-1}\pi'_r,$$
where $\pi'_r = \pi_r\epsilon$ is an associate of $\pi_r$, and so itself a prime.


12.8. The fundamental theorem of arithmetic in $k(i)$. Theorem
214 shows that every $\gamma$ can be expressed in the form
$$\gamma = \pi_1\pi_2\ldots\pi_r,$$
where every $\pi$ is a prime. The fundamental theorem asserts that, apart
from trivial variations, this representation is unique.

THEOREM 215 (THE FUNDAMENTAL THEOREM FOR GAUSSIAN
INTEGERS). The expression of an integer as a product of primes is
unique, apart from the order of the primes, the presence of unities, and
ambiguities between associated primes.

We use a process, analogous to Euclid’s algorithm, which depends
upon

THEOREM 216. Given any two integers $\gamma$, $\gamma_1$, of which $\gamma_1 \ne 0$, there is
an integer $\kappa$ such that
$$\gamma = \kappa\gamma_1+\gamma_2, \qquad N\gamma_2 < N\gamma_1.$$
We shall actually prove more than this, viz. that
$$N\gamma_2 \le \tfrac12N\gamma_1,$$
but the essential point, on which the proof of the fundamental theorem
depends, is what is stated in the theorem. If $c$ and $c_1$ are positive rational
integers, and $c_1 \ne 0$, there is a $k$ such that
$$c = kc_1+c_2, \qquad 0 \le c_2 < c_1.$$
It is on this that the construction of Euclid’s algorithm depends, and
Theorem 216 provides the basis for a similar construction in $k(i)$.
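Since $\gamma_1 \ne 0$, we have
        $$\frac{\gamma}{\gamma_1} = R+Si,$$
where R and S are real; in fact R and S are rational, but this is
irrelevant. We can find two rational integers x and y such that
        $$|R-x| \le \tfrac12, \quad |S-y| \le \tfrac12;$$
and then
        $$\left|\frac{\gamma}{\gamma_1}-(x+iy)\right| = |(R-x)+i(S-y)| = \{(R-x)^2+(S-y)^2\}^{1/2} \le \frac{1}{\sqrt2}.$$
If we take
        $$\kappa = x+iy, \quad \gamma_2 = \gamma-\kappa\gamma_1,$$
we have
        $$|\gamma-\kappa\gamma_1| \le 2^{-1/2}|\gamma_1|,$$
and so, squaring,
        $$N\gamma_2 = N(\gamma-\kappa\gamma_1) \le \tfrac12 N\gamma_1.$$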
We now apply Theorem 216 to obtain an analogue of Euclid’s
algorithm. If $\gamma$ and $\gamma_1$ are given, and $\gamma_1 \ne 0$, we have
        $$\gamma = \kappa\gamma_1+\gamma_2 \quad (N\gamma_2 < N\gamma_1).$$
If $\gamma_2 \ne 0$, we have
        $$\gamma_1 = \kappa_1\gamma_2+\gamma_3 \quad (N\gamma_3 < N\gamma_2),$$
and so on. Since
        $$N\gamma_1,\ N\gamma_2,\ \ldots$$
is a decreasing sequence of non-negative rational integers, there must
be an n for which
        $$N\gamma_{n+1} = 0, \quad \gamma_{n+1} = 0,$$
and the last steps of the algorithm will be
        $$\gamma_{n-2} = \kappa_{n-2}\gamma_{n-1}+\gamma_n \quad (N\gamma_n < N\gamma_{n-1}),$$
        $$\gamma_{n-1} = \kappa_{n-1}\gamma_n.$$

It now follows, as in the proof of Theorem 207, that $\gamma_n$ is a common
divisor of $\gamma$ and $\gamma_1$, and that every common divisor of $\gamma$ and $\gamma_1$ is a
divisor of $\gamma_n$.
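
The division step of Theorem 216 and the algorithm built on it can be tried out numerically. The following is a minimal sketch, not part of the original text: a Gaussian integer a + bi is stored as the pair (a, b), and the names `norm`, `nearest`, `gauss_divmod`, and `gauss_gcd` are illustrative choices.

```python
def norm(g):
    """Norm N(a + bi) = a^2 + b^2 of a Gaussian integer stored as (a, b)."""
    a, b = g
    return a * a + b * b


def nearest(p, n):
    """An integer x with |p/n - x| <= 1/2, for integers p and n > 0."""
    return (2 * p + n) // (2 * n)


def gauss_divmod(g, g1):
    """Return (kappa, g2) with g = kappa*g1 + g2 and N(g2) <= N(g1)/2."""
    (a, b), (c, d) = g, g1
    n = norm(g1)
    if n == 0:
        raise ZeroDivisionError("gamma_1 must not be 0")
    # g/g1 = (a + bi)(c - di)/n = R + Si, with R = p/n and S = q/n.
    p, q = a * c + b * d, b * c - a * d
    x, y = nearest(p, n), nearest(q, n)
    # gamma_2 = gamma - kappa*gamma_1, where kappa = x + iy.
    g2 = (a - (x * c - y * d), b - (x * d + y * c))
    return (x, y), g2


def gauss_gcd(g, g1):
    """A highest common divisor of g and g1 (determined up to associates)."""
    while g1 != (0, 0):
        _, g2 = gauss_divmod(g, g1)
        g, g1 = g1, g2
    return g
```

For instance, `gauss_gcd((5, 0), (2, 1))` returns a number of norm 5, in agreement with 5 = (2+i)(2-i). The result is only determined up to a unity factor, so it is safer to compare norms than the pairs themselves.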

We have nothing at this stage corresponding exactly to Theorem 207,
since we have not yet defined ‘highest common divisor’. If $\zeta$ is a common
divisor of $\gamma$ and $\gamma_1$, and every common divisor of $\gamma$ and $\gamma_1$ is a divisor
of $\zeta$, we call $\zeta$ a highest common divisor of $\gamma$ and $\gamma_1$, and write $\zeta = (\gamma, \gamma_1)$.
Thus $\gamma_n$ is a highest common divisor of $\gamma$ and $\gamma_1$. The property of $(\gamma, \gamma_1)$
corresponding to that proved in Theorem 208 is thus absorbed into its
definition.

The highest common divisor is not unique, since any associate of a
highest common divisor is also a highest common divisor. If $\eta$ and $\zeta$
are each highest common divisors, then, by the definition,
        $$\eta \mid \zeta, \quad \zeta \mid \eta,$$
and so
        $$\zeta = \phi\eta, \quad \eta = \theta\zeta = \theta\phi\eta, \quad \theta\phi = 1.$$
Hence $\phi$ is a unity and $\zeta$ an associate of $\eta$, and the highest common divisor
is unique except for ambiguity between associates.

It will be noticed that we defined the highest common divisor of
two numbers of k(1) differently, viz. as the greatest among the common
divisors, and proved as a theorem that it possesses the property which
we take as our definition here. We might define the highest common
divisors of two integers of k(i) as those whose norm is greatest, but
the definition which we have adopted lends itself more naturally to
generalization.

We now use the algorithm to prove the analogue of Theorem 209, viz.

THEOREM 217. If $(\gamma, \gamma_1) = 1$ and $\gamma_1 \mid \beta\gamma$, then $\gamma_1 \mid \beta$.

We multiply the algorithm throughout by $\beta$ and find that
        $$(\beta\gamma, \beta\gamma_1) = \beta\gamma_n.$$
Since $(\gamma, \gamma_1) = 1$, $\gamma_n$ is a unity, and so
        $$(\beta\gamma, \beta\gamma_1) = \beta.$$
Now $\gamma_1 \mid \beta\gamma$, by hypothesis, and $\gamma_1 \mid \beta\gamma_1$. Hence, by the definition of the
highest common divisor,
        $$\gamma_1 \mid (\beta\gamma, \beta\gamma_1)$$
or $\gamma_1 \mid \beta$.

If $\pi$ is prime, and $(\pi, \gamma) = \mu$, then $\mu \mid \pi$ and $\mu \mid \gamma$. Since $\mu \mid \pi$, either
(1) $\mu$ is a unity, and so $(\pi, \gamma) = 1$, or (2) $\mu$ is an associate of $\pi$, and so
$\pi \mid \gamma$. Hence, if we take $\gamma_1 = \pi$ in Theorem 217, we obtain the analogue
of Euclid’s Theorem 3, viz.

THEOREM 218. If $\pi \mid \beta\gamma$, then $\pi \mid \beta$ or $\pi \mid \gamma$.

From this the fundamental theorem for k(i) follows by the argument
used for k(1) in § 1.3.


12.9. The integers of $k(\rho)$. We conclude this chapter with a more
summary discussion of the integers
        $$\xi = a+b\rho$$
defined in § 12.2. Throughout this section ‘integer’ means ‘integer of
$k(\rho)$’.

We define divisor, unity, associate, and prime in $k(\rho)$ as in k(i); but
the norm of $\xi = a+b\rho$ is
        $$N\xi = (a+b\rho)(a+b\rho^2) = a^2-ab+b^2.$$
Since
        $$a^2-ab+b^2 = (a-\tfrac12 b)^2+\tfrac34 b^2,$$
$N\xi$ is positive except when $\xi = 0$.

Since
        $$|a+b\rho|^2 = a^2-ab+b^2 = N(a+b\rho),$$
we have
        $$N\alpha\,N\beta = N(\alpha\beta), \quad N\alpha\,N\beta\ldots = N(\alpha\beta\ldots),$$
as in k(i).

Theorems 210, 212, 213, and 214 remain true in $k(\rho)$; and the proofs
are the same except for the difference in the form of the norm.

The unities are given by
        $$a^2-ab+b^2 = 1,$$
or
        $$(2a-b)^2+3b^2 = 4.$$
The only solutions of this equation are
        $$a = \pm1,\ b = 0; \quad a = 0,\ b = \pm1; \quad a = 1,\ b = 1; \quad a = -1,\ b = -1:$$
so that the unities are
        $$\pm1, \quad \pm\rho, \quad \pm(1+\rho)$$
or
        $$\pm1, \quad \pm\rho, \quad \pm\rho^2.$$
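
The list of unities can be confirmed by a direct search. This is a small sketch added for illustration; the function name `unities_of_k_rho` is hypothetical, and the search box is justified by $(2a-b)^2+3b^2 = 4$, which forces $|b| \le 1$ and $|a| \le 1$.

```python
def unities_of_k_rho():
    """All pairs (a, b) with a^2 - ab + b^2 = 1, i.e. the unities a + b*rho."""
    return [(a, b)
            for a in range(-2, 3)
            for b in range(-2, 3)
            if a * a - a * b + b * b == 1]
```

The search yields exactly six pairs, matching the six unities listed above.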
Any number whose norm is a rational prime is a prime; thus $1-\rho$ is
a prime, since $N(1-\rho) = 3$. The converse is false; for example, 2 is a
prime. For if
        $$2 = (a+b\rho)(c+d\rho),$$
then
        $$4 = (a^2-ab+b^2)(c^2-cd+d^2).$$
Hence either $a+b\rho$ or $c+d\rho$ is a unity, or
        $$a^2-ab+b^2 = \pm2, \quad (2a-b)^2+3b^2 = \pm8,$$
which is impossible.

The fundamental theorem is true in $k(\rho)$ also, and depends on a
theorem verbally identical with Theorem 216.

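THEOREM 219. Given any two integers $\gamma$, $\gamma_1$, of which $\gamma_1 \ne 0$, there is
an integer $\kappa$ such that
        $$\gamma = \kappa\gamma_1+\gamma_2, \quad N\gamma_2 < N\gamma_1.$$

For
        $$\frac{\gamma}{\gamma_1} = \frac{a+b\rho}{c+d\rho} = \frac{(a+b\rho)(c+d\rho^2)}{(c+d\rho)(c+d\rho^2)} = \frac{ac+bd-ad+(bc-ad)\rho}{c^2-cd+d^2} = R+S\rho,$$
say. We can find two rational integers x and y such that
        $$|R-x| \le \tfrac12, \quad |S-y| \le \tfrac12,$$
and then
        $$\left|\frac{\gamma}{\gamma_1}-(x+y\rho)\right|^2 = (R-x)^2-(R-x)(S-y)+(S-y)^2 \le \tfrac34.$$
Hence, if $\kappa = x+y\rho$, $\gamma_2 = \gamma-\kappa\gamma_1$, we have
        $$N\gamma_2 = N(\gamma-\kappa\gamma_1) \le \tfrac34 N\gamma_1 < N\gamma_1.$$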
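
The same construction can be checked by machine. In the sketch below, which is an illustration rather than part of the text, the integer $a+b\rho$ is stored as the pair (a, b); the names `eis_norm`, `eis_mul`, and `eis_divmod` are assumptions made for the example, and the product uses $\rho^2 = -1-\rho$.

```python
def eis_norm(z):
    """Norm N(a + b*rho) = a^2 - ab + b^2."""
    a, b = z
    return a * a - a * b + b * b


def eis_mul(z, w):
    """Product (a + b*rho)(c + d*rho), reduced with rho^2 = -1 - rho."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c - b * d)


def eis_divmod(g, g1):
    """Return (kappa, g2) with g = kappa*g1 + g2 and N(g2) <= (3/4) N(g1)."""
    (a, b), (c, d) = g, g1
    n = eis_norm(g1)
    if n == 0:
        raise ZeroDivisionError("gamma_1 must not be 0")
    # g/g1 = R + S*rho with R = p/n, S = q/n, as in the proof above.
    p, q = a * c + b * d - a * d, b * c - a * d
    # Round R and S to nearest rational integers x and y.
    x, y = (2 * p + n) // (2 * n), (2 * q + n) // (2 * n)
    kx, ky = eis_mul((x, y), g1)
    return (x, y), (a - kx, b - ky)
```

Dividing 5 by $\lambda = 1-\rho$, for example, gives `eis_divmod((5, 0), (1, -1)) == ((3, 2), (0, -1))`, that is $5 = (3+2\rho)(1-\rho)-\rho$, with a remainder of norm 1.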

The fundamental theorem for k(p) follows from Theorem 219 by the 
argument used in § 12.8. 

THEOREM 220 [THE FUNDAMENTAL THEOREM FOR $k(\rho)$]. The expression
of an integer of $k(\rho)$ as a product of primes is unique, apart from
the order of the primes, the presence of unities, and ambiguities between
associated primes.


We conclude with a few trivial propositions about the integers of 
k(p) which are of no intrinsic interest but will be required in Ch. XIII. 


THEOREM 221. $\lambda = 1-\rho$ is a prime.

This has been proved already.

THEOREM 222. All integers of $k(\rho)$ fall into three classes (mod $\lambda$),
typified by 0, 1, and -1.

The definitions of a congruence to modulus $\lambda$, a residue (mod $\lambda$), and
a class of residues (mod $\lambda$), are the same as in k(1).

If $\gamma$ is any integer of $k(\rho)$, we have
        $$\gamma = a+b\rho = a+b-b\lambda \equiv a+b \pmod{\lambda}.$$
Since $3 = (1-\rho)(1-\rho^2)$, $\lambda \mid 3$; and since a+b has one of the three residues
0, 1, -1 (mod 3), $\gamma$ has one of the same three residues (mod $\lambda$). These
residues are incongruent, since neither N1 = 1 nor N2 = 4 is divisible
by $N\lambda = 3$.

THEOREM 223. 3 is associated with $\lambda^2$.

For
        $$\lambda^2 = 1-2\rho+\rho^2 = -3\rho.$$

THEOREM 224. The numbers $\pm(1-\rho)$, $\pm(1-\rho^2)$, $\pm\rho(1-\rho)$ are all
associated with $\lambda$.

For
        $$\pm(1-\rho) = \pm\lambda, \quad \pm(1-\rho^2) = \mp\lambda\rho^2, \quad \pm\rho(1-\rho) = \pm\lambda\rho.$$


NOTES ON CHAPTER XII


§ 12.1. The Gaussian integers were used first by Gauss in his researches on 
biquadratic reciprocity. See in particular his memoirs entitled ‘Theoria resi- 
duorum biquadraticorum’, Werke, ii. 67-148. Gauss (here and in his memoirs 
on algebraic equations, Werke, iii. 3-64) was the first mathematician to use 
complex numbers in a really confident and scientific way. 

The numbers $a+b\rho$ were introduced by Eisenstein and Jacobi in their work on
cubic reciprocity. See Bachmann, Allgemeine Arithmetik der Zahlkörper, 142.

§ 12.5. We owe the substance of these remarks to Prof. S. Bochner.


XIII 
SOME DIOPHANTINE EQUATIONS 


13.1. Fermat’s last theorem. ‘Fermat’s last theorem’ asserts that 
the equation 
(13.1.1)    $$x^n+y^n = z^n,$$
where n is an integer greater than 2, has no integral solutions, except 
the trivial solutions in which one of the variables is 0. The theorem has 
never been proved for all n, or even in an infinity of genuinely distinct 
cases, but it is known to be true for 2 <n < 619. In this chapter we 
shall be concerned only with the two simplest cases of the theorem, in 
which n = 3 and n = 4. The case n = 4 is easy, and the case n = 3 
provides an excellent illustration of the use of the ideas of Ch. XII. 

13.2. The equation $x^2+y^2 = z^2$. The equation (13.1.1) is soluble
when n = 2; the most familiar solutions are 3, 4, 5 and 5, 12, 13. We
dispose of this problem first.

It is plain that we may suppose x, y, z positive without loss of
generality. Next
        $$d \mid x \;.\; d \mid y \;\to\; d \mid z.$$
Hence, if x, y, z is a solution with (x, y) = d, then x = dx', y = dy',
z = dz', and x', y', z' is a solution with (x', y') = 1. We may therefore
suppose that (x, y) = 1, the general solution being a multiple of a
solution satisfying this condition. Finally
        $$x \equiv 1 \pmod 2 \;.\; y \equiv 1 \pmod 2 \;\to\; z^2 \equiv 2 \pmod 4,$$
which is impossible; so that one of x and y must be odd and the other
even.

It is therefore sufficient for our purpose to prove the theorem which 
follows. 

THEOREM 225. The most general solution of the equation

(13.2.1)    $$x^2+y^2 = z^2,$$

satisfying the conditions

(13.2.2)    $$x > 0, \quad y > 0, \quad z > 0, \quad (x, y) = 1, \quad 2 \mid x,$$

is

(13.2.3)    $$x = 2ab, \quad y = a^2-b^2, \quad z = a^2+b^2,$$

where a, b are integers of opposite parity and

(13.2.4)    $$(a, b) = 1, \quad a > b > 0.$$

There is a (1,1) correspondence between different values of a, b and different
values of x, y, z.

First, let us assume (13.2.1) and (13.2.2). Since $2 \mid x$ and (x, y) = 1,
y and z are odd and (y, z) = 1. Hence $\tfrac12(z-y)$ and $\tfrac12(z+y)$ are integral
and
        $$\left(\frac{z-y}{2}, \frac{z+y}{2}\right) = 1.$$
By (13.2.1),
        $$\left(\frac{x}{2}\right)^2 = \left(\frac{z+y}{2}\right)\left(\frac{z-y}{2}\right),$$
and the two factors on the right, being coprime, must both be squares.
Hence
        $$\frac{z+y}{2} = a^2, \quad \frac{z-y}{2} = b^2,$$
where
        $$a > 0, \quad b > 0, \quad a > b, \quad (a, b) = 1.$$
Also
        $$a+b \equiv a^2+b^2 = z \equiv 1 \pmod 2,$$
and a and b are of opposite parity. Hence any solution of (13.2.1),
satisfying (13.2.2), is of the form (13.2.3); and a and b are of opposite
parity and satisfy (13.2.4).

Next, let us assume that a and b are of opposite parity and satisfy
(13.2.4). Then
        $$x^2+y^2 = 4a^2b^2+(a^2-b^2)^2 = (a^2+b^2)^2 = z^2,$$
        $$x > 0, \quad y > 0, \quad z > 0, \quad 2 \mid x.$$
If (x, y) = d, then $d \mid z$, and so
        $$d \mid y = a^2-b^2, \quad d \mid z = a^2+b^2;$$
and therefore $d \mid 2a^2$, $d \mid 2b^2$. Since (a, b) = 1, d must be 1 or 2, and the
second alternative is excluded because y is odd. Hence (x, y) = 1.

Finally, if y and z are given, $a^2$ and $b^2$, and consequently a and b, are
uniquely determined, so that different values of x, y, and z correspond
to different values of a and b.

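
The parametrization (13.2.3) translates directly into a generator of solutions. The sketch below is an illustration only; the function name `primitive_triples` and the bound `limit` on z are choices made for the example.

```python
from math import gcd


def primitive_triples(limit):
    """Solutions of x^2 + y^2 = z^2 with x even, (x, y) = 1 and z <= limit,
    obtained from x = 2ab, y = a^2 - b^2, z = a^2 + b^2."""
    out = []
    a = 2
    while a * a + 1 <= limit:
        for b in range(1, a):
            # a, b of opposite parity, coprime, a > b > 0, as in (13.2.4).
            if (a - b) % 2 == 1 and gcd(a, b) == 1 and a * a + b * b <= limit:
                out.append((2 * a * b, a * a - b * b, a * a + b * b))
        a += 1
    return sorted(out, key=lambda t: t[2])
```

With `limit = 30` this gives (4, 3, 5), (12, 5, 13), (8, 15, 17), (24, 7, 25), (20, 21, 29). Comparing the output with a brute-force search over all x, y, z is a convenient check both of completeness and of the (1,1) correspondence.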
13.3. The equation $x^4+y^4 = z^4$. We now apply Theorem 225 to
the proof of Fermat’s theorem for n = 4. This is the only ‘easy’ case
of the theorem. Actually we prove rather more.


THEOREM 226. There are no positive integral solutions of

(13.3.1)    $$x^4+y^4 = z^2.$$

Suppose that u is the least number for which

(13.3.2)    $$x^4+y^4 = u^2 \quad (x > 0,\ y > 0,\ u > 0)$$

has a solution. Then (x, y) = 1, for otherwise we can divide through
by $(x, y)^4$ and so replace u by a smaller number. Hence at least one of
x and y is odd, and
        $$u^2 = x^4+y^4 \equiv 1 \text{ or } 2 \pmod 4.$$
Since $u^2 \equiv 2 \pmod 4$ is impossible, u is odd, and just one of x and y
is even.

If x, say, is even, then, by Theorem 225,
        $$x^2 = 2ab, \quad y^2 = a^2-b^2, \quad u = a^2+b^2,$$
        $$a > 0, \quad b > 0, \quad (a, b) = 1,$$
and a and b are of opposite parity. If a is even and b odd, then
        $$y^2 \equiv -1 \pmod 4,$$
which is impossible; so that a is odd and b even, say b = 2c.
Next
        $$(\tfrac12 x)^2 = ac, \quad (a, c) = 1;$$
and so
        $$a = d^2, \quad c = f^2, \quad d > 0, \quad f > 0, \quad (d, f) = 1,$$
and d is odd. Hence
        $$y^2 = a^2-b^2 = d^4-4f^4,$$
        $$(2f^2)^2+y^2 = (d^2)^2,$$
and no two of $2f^2$, y, $d^2$ have a common factor.

Applying Theorem 225 again, we obtain
        $$2f^2 = 2lm, \quad d^2 = l^2+m^2, \quad l > 0, \quad m > 0, \quad (l, m) = 1.$$
Since
        $$f^2 = lm, \quad (l, m) = 1,$$
we have
        $$l = r^2, \quad m = s^2 \quad (r > 0,\ s > 0),$$
and so
        $$r^4+s^4 = d^2.$$
But
        $$d \le d^2 = a \le a^2 < a^2+b^2 = u,$$
and so u is not the least number for which (13.3.2) is possible. This
contradiction proves the theorem.

The method of proof which we have used, and which was invented 
and applied to many problems by Fermat, is known as the ‘method of 
descent’. If a proposition P(n) is true for some positive integer n, there 
is a smallest such integer. If P(n), for any positive n, implies P(n’) for 
some smaller positive n’, then there is no such smallest integer; and
the contradiction shows that P(n) is false for every n. 
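
A finite search cannot replace the descent, but it gives a quick sanity check of Theorem 226 for small values. The following sketch is illustrative only; the function name and the search bound are assumptions.

```python
from math import isqrt


def fourth_power_sums_that_are_squares(bound):
    """All (x, y, u) with 1 <= x <= y <= bound and x^4 + y^4 = u^2."""
    hits = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            s = x ** 4 + y ** 4
            u = isqrt(s)  # exact integer square root
            if u * u == s:
                hits.append((x, y, u))
    return hits
```

By Theorem 226 the returned list is empty for every bound.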


13.4. The equation $x^3+y^3 = z^3$. If Fermat’s theorem is true for
some n, it is true for any multiple of n, since $x^{ln}+y^{ln} = z^{ln}$ is
        $$(x^l)^n+(y^l)^n = (z^l)^n.$$
The theorem is therefore true generally if it is true (a) when n = 4 (as
we have shown) and (b) when n is an odd prime. The only case of (b)
which we can discuss here is the case n = 3.

The natural method of attack, after Ch. XII, is to write Fermat’s
equation in the form
        $$(x+y)(x+\rho y)(x+\rho^2 y) = z^3,$$
and consider the structure of the various factors in $k(\rho)$. As in § 13.3,
we prove rather more than Fermat’s theorem.

THEOREM 227. There are no solutions of
        $$\xi^3+\eta^3+\zeta^3 = 0 \quad (\xi \ne 0,\ \eta \ne 0,\ \zeta \ne 0)$$
in integers of $k(\rho)$. In particular, there are no solutions of
        $$x^3+y^3 = z^3$$
in rational integers, except the trivial solutions in which one of x, y, z is 0.

In the proof that follows, Greek letters denote integers in $k(\rho)$, and
$\lambda$ is the prime $1-\rho$.† We may plainly suppose that

(13.4.1)    $$(\eta, \zeta) = (\zeta, \xi) = (\xi, \eta) = 1.$$

We base the proof on four lemmas (Theorems 228-31).

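THEOREM 228. If $\omega$ is not divisible by $\lambda$, then
        $$\omega^3 \equiv \pm1 \pmod{\lambda^4}.$$

Since $\omega$ is congruent to one of 0, 1, -1, by Theorem 222, and $\lambda \nmid \omega$,
we have
        $$\omega \equiv \pm1 \pmod{\lambda}.$$
We can therefore choose $\alpha = \pm\omega$ so that
        $$\alpha \equiv 1 \pmod{\lambda}, \quad \alpha = 1+\beta\lambda.$$
Then
        $$\pm(\omega^3 \mp 1) = \alpha^3-1 = (\alpha-1)(\alpha-\rho)(\alpha-\rho^2)$$
        $$= \beta\lambda(\beta\lambda+1-\rho)(\beta\lambda+1-\rho^2)$$
        $$= \lambda^3\beta(\beta+1)(\beta-\rho^2),$$
since $1-\rho^2 = \lambda(1+\rho) = -\lambda\rho^2$. Also
        $$\rho^2 \equiv 1 \pmod{\lambda},$$
so that
        $$\beta(\beta+1)(\beta-\rho^2) \equiv \beta(\beta+1)(\beta-1) \pmod{\lambda}.$$
But one of $\beta$, $\beta+1$, $\beta-1$ is divisible by $\lambda$, by Theorem 222; and so
        $$\pm(\omega^3 \mp 1) \equiv 0 \pmod{\lambda^4}$$
or
        $$\omega^3 \equiv \pm1 \pmod{\lambda^4}.$$
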
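
Theorem 228 is easy to test numerically. The sketch below is not from the text and rests on two observations that follow from Theorems 222 and 223: $\lambda \mid a+b\rho$ exactly when $3 \mid a+b$, and $\lambda^4 = (-3\rho)^2 = 9\rho^2$ is an associate of 9, so that divisibility by $\lambda^4$ means that both coordinates are divisible by 9. The names `eis_mul`, `eis_cube`, and `cube_sign_mod_lambda4` are hypothetical.

```python
def eis_mul(z, w):
    """Product (a + b*rho)(c + d*rho), reduced with rho^2 = -1 - rho."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c - b * d)


def eis_cube(z):
    """Cube of the integer z = a + b*rho of k(rho)."""
    return eis_mul(z, eis_mul(z, z))


def cube_sign_mod_lambda4(z):
    """Return +1 or -1 if z^3 is congruent to +1 or -1 (mod lambda^4),
    and None otherwise.  lambda^4 is an associate of 9, so the test is
    divisibility of both coordinates of z^3 -/+ 1 by 9."""
    p, q = eis_cube(z)
    if q % 9 == 0 and (p - 1) % 9 == 0:
        return 1
    if q % 9 == 0 and (p + 1) % 9 == 0:
        return -1
    return None
```

For $\omega = 2-\rho$, for instance, $\omega^3 = 1-18\rho$, so the function returns +1; for $\omega = \lambda$ itself it returns `None`, as it must.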
THEOREM 229. If $\xi^3+\eta^3+\zeta^3 = 0$, then one of $\xi$, $\eta$, $\zeta$ is divisible by $\lambda$.

Let us suppose the contrary. Then
        $$0 = \xi^3+\eta^3+\zeta^3 \equiv \pm1\pm1\pm1 \pmod{\lambda^4},$$
and so $\pm1 \equiv 0$ or $\pm3 \equiv 0$, i.e. $\lambda^4 \mid 1$ or $\lambda^4 \mid 3$. The first hypothesis is
untenable because $\lambda$ is not a unity; and the second because 3 is an
associate of $\lambda^2$‡ and therefore not divisible by $\lambda^4$. Hence one of $\xi$, $\eta$, $\zeta$
must be divisible by $\lambda$.

† See Theorem 221.

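We may therefore suppose that $\lambda \mid \zeta$, and that
        $$\zeta = \lambda^n\gamma,$$
where $\lambda \nmid \gamma$. Then $\lambda \nmid \xi$, $\lambda \nmid \eta$, by (13.4.1), and we have to prove the
impossibility of

(13.4.2)    $$\xi^3+\eta^3+\lambda^{3n}\gamma^3 = 0,$$

where

(13.4.3)    $$(\xi, \eta) = 1, \quad n \ge 1, \quad \lambda \nmid \xi, \quad \lambda \nmid \eta, \quad \lambda \nmid \gamma.$$

It is convenient to prove more, viz. that

(13.4.4)    $$\xi^3+\eta^3+\epsilon\lambda^{3n}\gamma^3 = 0$$

cannot be satisfied by any $\xi$, $\eta$, $\gamma$ subject to (13.4.3) and any unity $\epsilon$.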


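THEOREM 230. If $\xi$, $\eta$, and $\gamma$ satisfy (13.4.3) and (13.4.4), then $n \ge 2$.

By Theorem 228,
        $$-\epsilon\lambda^{3n}\gamma^3 = \xi^3+\eta^3 \equiv \pm1\pm1 \pmod{\lambda^4}.$$
If the signs are the same, then
        $$-\epsilon\lambda^{3n}\gamma^3 \equiv \pm2 \pmod{\lambda^4},$$
which is impossible because $\lambda \nmid 2$. Hence the signs are opposite, and
        $$-\epsilon\lambda^{3n}\gamma^3 \equiv 0 \pmod{\lambda^4}.$$
Since $\lambda \nmid \gamma$, $n \ge 2$.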

THEOREM 231. If (13.4.4) is possible for n = m > 1, then it is possible
for n = m-1.

Theorem 231 represents the critical stage in the proof of Theorem 227;
when it is proved, Theorem 227 follows immediately. For if (13.4.4) is
possible for any n, it is possible for n = 1, in contradiction to Theorem
230. The argument is another example of the ‘method of descent’.

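Our hypothesis is that

(13.4.5)    $$-\epsilon\lambda^{3m}\gamma^3 = (\xi+\eta)(\xi+\rho\eta)(\xi+\rho^2\eta).$$

The differences of the factors on the right are
        $$\eta\lambda, \quad \rho\eta\lambda, \quad \rho^2\eta\lambda,$$
all associates of $\eta\lambda$. Each of them is divisible by $\lambda$ but not by $\lambda^2$ (since
$\lambda \nmid \eta$).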
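Since $m \ge 2$, 3m > 3, and one of the three factors must be divisible
by $\lambda^2$. The other two factors must be divisible by $\lambda$ (since the differences
are divisible), but not by $\lambda^2$ (since the differences are not). We may
suppose that the factor divisible by $\lambda^2$ is $\xi+\eta$; if it were one of the other
factors, we could replace $\eta$ by one of its associates. We have then

(13.4.6)    $$\xi+\eta = \lambda^{3m-2}\kappa_1, \quad \xi+\rho\eta = \lambda\kappa_2, \quad \xi+\rho^2\eta = \lambda\kappa_3,$$

where none of $\kappa_1$, $\kappa_2$, $\kappa_3$ is divisible by $\lambda$.

‡ Theorem 223.

If $\delta \mid \kappa_2$ and $\delta \mid \kappa_3$, then $\delta$ also divides
        $$\kappa_2-\kappa_3 = \rho\eta$$
and
        $$\rho\kappa_3-\rho^2\kappa_2 = \rho\xi,$$
and therefore both $\xi$ and $\eta$. Hence $\delta$ is a unity and $(\kappa_2, \kappa_3) = 1$.
Similarly $(\kappa_3, \kappa_1) = 1$ and $(\kappa_1, \kappa_2) = 1$.

Substituting from (13.4.6) into (13.4.5), we obtain
        $$-\epsilon\gamma^3 = \kappa_1\kappa_2\kappa_3.$$
Hence each of $\kappa_1$, $\kappa_2$, $\kappa_3$ is an associate of a cube, so that
        $$\xi+\eta = \lambda^{3m-2}\kappa_1 = \epsilon_1\lambda^{3m-2}\theta^3, \quad \xi+\rho\eta = \epsilon_2\lambda\phi^3, \quad \xi+\rho^2\eta = \epsilon_3\lambda\psi^3,$$
where $\theta$, $\phi$, $\psi$ have no common factor and are not divisible by $\lambda$, and
$\epsilon_1$, $\epsilon_2$, $\epsilon_3$ are unities. It follows that
        $$0 = (1+\rho+\rho^2)(\xi+\eta) = \xi+\eta+\rho(\xi+\rho\eta)+\rho^2(\xi+\rho^2\eta)$$
        $$= \epsilon_1\lambda^{3m-2}\theta^3+\epsilon_2\rho\lambda\phi^3+\epsilon_3\rho^2\lambda\psi^3;$$
and so that

(13.4.7)    $$\phi^3+\epsilon_4\psi^3+\epsilon_5\lambda^{3m-3}\theta^3 = 0,$$

where $\epsilon_4 = \epsilon_3\rho/\epsilon_2$ and $\epsilon_5 = \epsilon_1/\epsilon_2\rho$ are also unities.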


Now $m \ge 2$ and so
        $$\phi^3+\epsilon_4\psi^3 \equiv 0 \pmod{\lambda^2}$$
(in fact, mod $\lambda^3$). But $\lambda \nmid \phi$ and $\lambda \nmid \psi$, and therefore, by Theorem 228,
        $$\phi^3 \equiv \pm1 \pmod{\lambda^2}, \quad \psi^3 \equiv \pm1 \pmod{\lambda^2}$$
(in fact, mod $\lambda^4$). Hence
        $$\pm1\pm\epsilon_4 \equiv 0 \pmod{\lambda^2}.$$
Here $\epsilon_4$ is $\pm1$, $\pm\rho$, or $\pm\rho^2$. But none of
        $$\pm1\pm\rho, \quad \pm1\pm\rho^2$$
is divisible by $\lambda^2$, since each is an associate of 1 or of $\lambda$; and therefore
$\epsilon_4 = \pm1$.

If $\epsilon_4 = 1$, (13.4.7) is an equation of the type required. If $\epsilon_4 = -1$,
we replace $\psi$ by $-\psi$. In either case we have proved Theorem 231 and
therefore Theorem 227.

13.5. The equation $x^3+y^3 = 3z^3$. Almost the same reasoning will
prove

THEOREM 232. The equation
        $$x^3+y^3 = 3z^3$$
has no solutions in integers, except the trivial solutions in which z = 0.

The proof is, as might be expected, substantially the same as that
of Theorem 227, since 3 is an associate of $\lambda^2$. We again prove more, viz.
that there are no solutions of

(13.5.1)    $$\xi^3+\eta^3+\epsilon\lambda^{3n+2}\gamma^3 = 0,$$

where
        $$(\xi, \eta) = 1, \quad \lambda \nmid \gamma,$$
in integers of $k(\rho)$. And again we prove the theorem by proving two
propositions, viz.

(a) if there is a solution, then n > 0;

(b) if there is a solution for $n = m \ge 1$, then there is a solution for
n = m-1;

which are contradictory if there is a solution for any n.

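We have
        $$(\xi+\eta)(\xi+\rho\eta)(\xi+\rho^2\eta) = -\epsilon\lambda^{3m+2}\gamma^3.$$
Hence at least one factor on the left, and therefore every factor, is
divisible by $\lambda$; and hence m > 0. It then follows that 3m+2 > 3 and
that one factor is divisible by $\lambda^2$, and (as in § 13.4) only one. We have
therefore
        $$\xi+\eta = \lambda^{3m}\kappa_1, \quad \xi+\rho\eta = \lambda\kappa_2, \quad \xi+\rho^2\eta = \lambda\kappa_3,$$
the $\kappa$ being coprime in pairs and not divisible by $\lambda$.

Hence, as in § 13.4,
        $$-\epsilon\gamma^3 = \kappa_1\kappa_2\kappa_3,$$
and $\kappa_1$, $\kappa_2$, $\kappa_3$ are the associates of cubes, so that
        $$\xi+\eta = \epsilon_1\lambda^{3m}\theta^3, \quad \xi+\rho\eta = \epsilon_2\lambda\phi^3, \quad \xi+\rho^2\eta = \epsilon_3\lambda\psi^3.$$
It then follows that
        $$0 = \xi+\eta+\rho(\xi+\rho\eta)+\rho^2(\xi+\rho^2\eta) = \epsilon_1\lambda^{3m}\theta^3+\epsilon_2\rho\lambda\phi^3+\epsilon_3\rho^2\lambda\psi^3,$$
        $$\phi^3+\epsilon_4\psi^3+\epsilon_5\lambda^{3m-1}\theta^3 = 0;$$
and the remainder of the proof is the same as that of Theorem 227.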

It is not possible to prove in this way that

(13.5.2)    $$\xi^3+\eta^3+\epsilon\lambda^{3n+1}\gamma^3 \ne 0.$$

In fact
        $$1^3+2^3+9(-1)^3 = 0$$
and, since $9 = \rho\lambda^4$,† this equation is of the form (13.5.2). The reader
will find it instructive to attempt the proof and observe where it fails.

† See the proof of Theorem 223.

13.6. The expression of a rational as a sum of rational cubes.
Theorem 232 has a very interesting application to the ‘additive’ theory
of numbers.

The typical problem of this theory is as follows. Suppose that x
denotes an arbitrary member of a specified class of numbers, such as
the class of positive integers or the class of rationals, and y is a member
of some sub-class of the former class, such as the class of integral squares
or rational cubes. Is it possible to express x in the form
        $$x = y_1+y_2+\ldots+y_k;$$
and, if so, how economically, that is to say with how small a value of k?

For example, suppose x a positive integer and y an integral square.
Lagrange’s Theorem 369† shows that every positive integer is the sum
of four squares, so that we may take k = 4. Since 7, for example, is
not a sum of three squares, the value 4 of k is the least possible or the
‘correct’ one.
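
The ‘correct’ value of k in this example can be observed experimentally. The following sketch is only an illustration (the function name is hypothetical): it computes, for each n up to a bound, the least number of positive integral squares whose sum is n.

```python
from math import isqrt


def least_squares_needed(limit):
    """best[n] = least k such that n is a sum of k positive integral squares."""
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # Remove one square s^2 and reuse the answer for the remainder.
        best[n] = 1 + min(best[n - s * s] for s in range(1, isqrt(n) + 1))
    return best
```

The table never exceeds 4, in agreement with Lagrange’s theorem, and the first entry equal to 4 is the one for n = 7.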

Here we shall suppose that x is a positive rational, and y a non-negative
rational cube, and we shall show that the ‘correct’ value of k is 3.

In the first place we have, as a corollary of Theorem 232,

THEOREM 233. There are positive rationals which are not sums of two
non-negative rational cubes.

For example, 3 is such a rational. For
        $$\left(\frac{a}{b}\right)^3+\left(\frac{c}{d}\right)^3 = 3$$
involves
        $$(ad)^3+(bc)^3 = 3(bd)^3,$$
in contradiction to Theorem 232.‡
† Proved in various ways in Ch. XX.
‡ Theorem 227 shows that 1 is not the sum of two positive rational cubes, but it is of
course expressible as $0^3+1^3$.

In order to show that 3 is an admissible value of k, we require another
theorem of a more elementary character.

THEOREM 234. Any positive rational is the sum of three positive rational
cubes.

We have to solve

(13.6.1)    $$r = x^3+y^3+z^3,$$

where r is given, with positive rational x, y, z. It is easily verified that
        $$x^3+y^3+z^3 = (x+y+z)^3-3(y+z)(z+x)(x+y)$$
and so (13.6.1) is equivalent to
        $$(x+y+z)^3-3(y+z)(z+x)(x+y) = r.$$
If we write X = y+z, Y = z+x, Z = x+y, this becomes

(13.6.2)    $$(X+Y+Z)^3-24XYZ = 8r.$$

If we put

(13.6.3)    $$u = \frac{X+Z}{Z}, \quad v = \frac{Y}{Z},$$

(13.6.2) becomes

(13.6.4)    $$(u+v)^3-24v(u-1) = 8rZ^{-3}.$$

Next we restrict Z and v to satisfy

(13.6.5)    $$r = 3Z^3v,$$

so that (13.6.4) reduces to

(13.6.6)    $$(u+v)^3 = 24uv.$$

To solve (13.6.6), we put u = vt and find that

(13.6.7)    $$u = \frac{24t^2}{(t+1)^3}, \quad v = \frac{24t}{(t+1)^3}.$$

This is a solution of (13.6.6) for every rational t. We have still to satisfy
(13.6.5), which now becomes
        $$r(t+1)^3 = 72Z^3t.$$
If we put $t = r/(72w^3)$, where w is any rational number, we have
Z = w(t+1). Hence a solution of (13.6.2) is

(13.6.8)    $$X = (u-1)Z, \quad Y = vZ, \quad Z = w(t+1),$$

where u, v are given by (13.6.7) with $t = rw^{-3}/72$. We deduce the solution
of (13.6.1) by using

(13.6.9)    $$2x = Y+Z-X, \quad 2y = Z+X-Y, \quad 2z = X+Y-Z.$$

To complete the proof of Theorem 234, we have to show that we can
choose w so that x, y, z are all positive. If w is taken positive, then t and
Z are positive. Now, by (13.6.8) and (13.6.9) we have
        $$\frac{2x}{Z} = v+1-(u-1) = 2+v-u, \quad \frac{2y}{Z} = u-v, \quad \frac{2z}{Z} = u+v-2.$$
These are all positive provided that
        $$u > v, \quad u-v < 2 < u+v,$$
that is
        $$t > 1, \quad 12t(t-1) < (t+1)^3 < 12t(t+1).$$
These are certainly true if t is a little greater than 1, and we may choose w
so that
        $$t = \frac{r}{72w^3}$$
satisfies this requirement. (In fact, it is enough if $1 < t \le 2$.)
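Suppose for example that $r = \tfrac23$. If we put $w = \tfrac16$ so that t = 2, we
have
        $$\tfrac23 = (\tfrac1{18})^3+(\tfrac49)^3+(\tfrac56)^3.$$
The equation
        $$1 = (\tfrac12)^3+(\tfrac23)^3+(\tfrac56)^3,$$
which is equivalent to

(13.6.10)    $$6^3 = 3^3+4^3+5^3,$$

is even simpler, but is not obtainable by this method.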
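
The construction in the proof of Theorem 234 is entirely explicit, and can be followed step by step in exact rational arithmetic. The sketch below is an illustration under stated assumptions: the names `three_cubes` and `pick_w` are hypothetical, and `pick_w` is merely one simple way of finding a positive rational w with $1 < r/(72w^3) \le 2$.

```python
from fractions import Fraction


def three_cubes(r, w):
    """Follow (13.6.7)-(13.6.9): return rationals x, y, z with
    x^3 + y^3 + z^3 = r, for a given non-zero rational w."""
    r, w = Fraction(r), Fraction(w)
    t = r / (72 * w ** 3)
    u = 24 * t ** 2 / (t + 1) ** 3
    v = 24 * t / (t + 1) ** 3
    Z = w * (t + 1)
    X = (u - 1) * Z
    Y = v * Z
    return (Y + Z - X) / 2, (Z + X - Y) / 2, (X + Y - Z) / 2


def pick_w(r):
    """A positive rational w = p/q for which t = r/(72 w^3) has 1 < t <= 2."""
    r = Fraction(r)
    q = 1
    while True:
        p = 1
        while 144 * Fraction(p, q) ** 3 < r:   # enforce t <= 2
            p += 1
        if 72 * Fraction(p, q) ** 3 < r:       # enforce t > 1
            return Fraction(p, q)
        q += 1
```

For a positive rational r, `three_cubes(r, pick_w(r))` then returns three positive rationals whose cubes add up to r; any other w with t in the same range would serve equally well.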
13.7. The equation $x^3+y^3+z^3 = t^3$. There are a number of other
Diophantine equations which it would be natural to consider here; and
the most interesting are

(13.7.1)    $$x^3+y^3+z^3 = t^3$$

and

(13.7.2)    $$x^3+y^3 = u^3+v^3.$$

The second equation is derived from the first by writing -u, v for z, t.

Each of the equations gives rise to a number of different problems,
since we may look for solutions in (a) integers or (b) rationals, and we
may or may not be interested in the signs of the solutions. The simplest
problem (and the only one which has been solved completely) is that
of the solution of the equations in positive or negative rationals. For
this problem, the equations are equivalent, and we take the form
(13.7.2). The complete solution was found by Euler and simplified by
Binet.

If we put
        $$x = X-Y, \quad y = X+Y, \quad u = U-V, \quad v = U+V,$$
(13.7.2) becomes

(13.7.3)    $$X(X^2+3Y^2) = U(U^2+3V^2).$$

We suppose that X and Y are not both 0. We may then write
        $$\frac{U+V\sqrt{-3}}{X+Y\sqrt{-3}} = a+b\sqrt{-3}, \quad \frac{U-V\sqrt{-3}}{X-Y\sqrt{-3}} = a-b\sqrt{-3},$$
where a, b are rational. From the first of these

(13.7.4)    $$U = aX-3bY, \quad V = bX+aY,$$

while (13.7.3) becomes
        $$X = U(a^2+3b^2).$$
This last, combined with the first of (13.7.4), gives us
        $$cX = dY,$$
where
        $$c = a(a^2+3b^2)-1, \quad d = 3b(a^2+3b^2).$$
If c = d = 0, then b = 0, a = 1, X = U, Y = V. Otherwise

(13.7.5)    $$X = \lambda d = 3\lambda b(a^2+3b^2), \quad Y = \lambda c = \lambda\{a(a^2+3b^2)-1\},$$

where $\lambda \ne 0$. Using these in (13.7.4), we find that

(13.7.6)    $$U = 3\lambda b, \quad V = \lambda\{(a^2+3b^2)^2-a\}.$$

Hence, apart from the two trivial solutions
        $$X = Y = U = 0; \quad X = U,\ Y = V,$$
every rational solution of (13.7.3) takes the form given in (13.7.5) and
(13.7.6) for appropriate rational $\lambda$, a, b.

Conversely, if $\lambda$, a, b are any rational numbers and X, Y, U, V are
defined by (13.7.5) and (13.7.6), the formulae (13.7.4) follow at once
and
        $$U(U^2+3V^2) = 3\lambda b\{(aX-3bY)^2+3(bX+aY)^2\}$$
        $$= 3\lambda b(a^2+3b^2)(X^2+3Y^2) = X(X^2+3Y^2).$$
We have thus proved


THEOREM 235. Apart from the trivial solutions

(13.7.7)    $$x = y = 0,\ u = -v; \quad x = u,\ y = v,$$

the general rational solution of (13.7.2) is given by

(13.7.8)    $$x = \lambda\{1-(a-3b)(a^2+3b^2)\}, \quad y = \lambda\{(a+3b)(a^2+3b^2)-1\},$$
            $$u = \lambda\{(a+3b)-(a^2+3b^2)^2\}, \quad v = \lambda\{(a^2+3b^2)^2-(a-3b)\},$$

where $\lambda$, a, b are any rational numbers except that $\lambda \ne 0$.


The problem of finding all integral solutions of (13.7.2) is more
difficult. Integral values of a, b and $\lambda$ in (13.7.8) give an integral solution,
but there is no converse correspondence. The simplest solution of
(13.7.2) in positive integers is

(13.7.9)    $$x = 1, \quad y = 12, \quad u = 9, \quad v = 10,$$

corresponding to
        $$a = \tfrac{10}{19}, \quad b = -\tfrac{7}{19}, \quad \lambda = -\tfrac{361}{42}.$$
On the other hand, if we put a = b = 1, $\lambda = \tfrac13$, we have
        $$x = 3, \quad y = 5, \quad u = -4, \quad v = 6,$$
equivalent to (13.6.10).
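
The formulae (13.7.8) can be evaluated exactly with rational arithmetic. The sketch below is an illustration; the function name `euler_binet` and its argument order are assumptions made for the example.

```python
from fractions import Fraction


def euler_binet(a, b, lam):
    """Evaluate (13.7.8): rationals (x, y, u, v) with x^3 + y^3 = u^3 + v^3."""
    a, b, lam = Fraction(a), Fraction(b), Fraction(lam)
    s = a * a + 3 * b * b
    x = lam * (1 - (a - 3 * b) * s)
    y = lam * ((a + 3 * b) * s - 1)
    u = lam * ((a + 3 * b) - s * s)
    v = lam * (s * s - (a - 3 * b))
    return x, y, u, v
```

With a = 10/19, b = -7/19, and λ = -361/42 this reproduces the solution 1, 12, 9, 10 of (13.7.9), and with a = b = 1, λ = 1/3 it gives 3, 5, -4, 6.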
Other simple solutions of (13.7.1) or (13.7.2) are
        $$1^3+6^3+8^3 = 9^3, \quad 2^3+34^3 = 15^3+33^3, \quad 9^3+15^3 = 2^3+16^3.$$
Ramanujan gave
        $$x = 3a^2+5ab-5b^2, \quad y = 4a^2-4ab+6b^2,$$
        $$z = 5a^2-5ab-3b^2, \quad t = 6a^2-4ab+4b^2$$
as a solution of (13.7.1). If we take a = 2, b = 1, we obtain the solution
(17, 14, 7, 20). If we take a = 1, b = -2, we obtain a solution
equivalent to (13.7.9). Other similar solutions are recorded in Dickson’s
History.
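
Ramanujan’s formulae are easily checked for particular values of a and b. The short sketch below is an illustration only, with the hypothetical function name `ramanujan`.

```python
def ramanujan(a, b):
    """Ramanujan's (x, y, z, t) with x^3 + y^3 + z^3 = t^3."""
    x = 3 * a * a + 5 * a * b - 5 * b * b
    y = 4 * a * a - 4 * a * b + 6 * b * b
    z = 5 * a * a - 5 * a * b - 3 * b * b
    t = 6 * a * a - 4 * a * b + 4 * b * b
    return x, y, z, t
```

Here `ramanujan(2, 1)` is (17, 14, 7, 20), while `ramanujan(1, -2)` is (-27, 36, 3, 30), which on division by 3 becomes $1^3+12^3 = 9^3+10^3$.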

Much less is known about the equation

(13.7.10)    $$x^4+y^4 = u^4+v^4,$$

first solved by Euler. The simplest parametric solution known is

(13.7.11)    $$x = a^7+a^5b^2-2a^3b^4+3a^2b^5+ab^6,$$
             $$y = a^6b-3a^5b^2-2a^4b^3+a^2b^5+b^7,$$
             $$u = a^7+a^5b^2-2a^3b^4-3a^2b^5+ab^6,$$
             $$v = a^6b+3a^5b^2-2a^4b^3+a^2b^5+b^7,$$

but this solution is not in any sense complete. When a = 1, b = 2 it
leads to
        $$133^4+134^4 = 158^4+59^4,$$
and this is the smallest integral solution of (13.7.10).

To solve (13.7.10), we put

(13.7.12)    $$x = aw+c, \quad y = bw-d, \quad u = aw+d, \quad v = bw+c.$$

We thus obtain a quartic equation for w, in which the first and last
coefficients are zero. The coefficient of $w^3$ will also be zero if
        $$c(a^3-b^3) = d(a^3+b^3),$$
in particular if $c = a^3+b^3$, $d = a^3-b^3$; and then, on dividing by w, we
find that
        $$3w(a^2-b^2)(c^2-d^2) = 2(ad^3-ac^3+bc^3+bd^3).$$
Finally, when we substitute these values of c, d, and w in (13.7.12),
and multiply throughout by $3a^2b^2$, we obtain (13.7.11).

We shall say something more about problems of this kind in Ch. XXI.
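
The parametric solution (13.7.11) of (13.7.10) can be verified in integer arithmetic. The sketch below is illustrative only; the function name `quartic_solution` is a hypothetical choice.

```python
def quartic_solution(a, b):
    """Evaluate (13.7.11): integers (x, y, u, v) with x^4 + y^4 = u^4 + v^4."""
    x = a**7 + a**5 * b**2 - 2 * a**3 * b**4 + 3 * a**2 * b**5 + a * b**6
    y = a**6 * b - 3 * a**5 * b**2 - 2 * a**4 * b**3 + a**2 * b**5 + b**7
    u = a**7 + a**5 * b**2 - 2 * a**3 * b**4 - 3 * a**2 * b**5 + a * b**6
    v = a**6 * b + 3 * a**5 * b**2 - 2 * a**4 * b**3 + a**2 * b**5 + b**7
    return x, y, u, v
```

For a = 1, b = 2 the function returns (133, 134, -59, 158); since only fourth powers occur, the sign of u is immaterial, and the result is the solution quoted in the text.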


NOTES ON CHAPTER XIII


§ 13.1. All this chapter, up to § 13.5, is modelled on Landau, Vorlesungen, iii.
201-17.

The phrase ‘Diophantine equation’ is derived from Diophantus of Alexandria
(about A.D. 250), who was the first writer to make a systematic study of the
solution of equations in integers. Diophantus proved the substance of Theorem
225. Particular solutions had been known to Greek mathematicians from
Pythagoras onwards. Heath’s Diophantus of Alexandria (Cambridge, 1910)
includes translations of all the extant works of Diophantus, of Fermat’s
comments on them, and of many solutions of Diophantine problems by Euler.

There is a very large literature about ‘Fermat’s last theorem’. In particular 
we may refer to Bachmann, Das Fermatproblem; Dickson, History, ii, ch. xxvi; 
Landau, Vorlesungen, iii; Mordell, Three lectures on Fermat’s last theorem (Cam- 
bridge, 1921); Noguès, Théorème de Fermat, son histoire (Paris, 1932); Vandiver, 
Report of the committee on algebraic numbers, ii (Washington, 1928), ch. ii, and 
Amer. Math. Monthly, 53 (1946), 555-78. 

The theorem was enunciated by Fermat in 1637 in a marginal note in his copy 
of Bachet’s edition of the works of Diophantus. Here he asserts definitely that 
he possessed a proof, but the later history of the subject seems to show that he 
must have been mistaken. A very large number of fallacious proofs have been 
published. 

In view of the remark at the beginning of § 13.4, we can suppose that n = p > 2.
Kummer (1850) proved the theorem for n = p, whenever the odd prime p is
‘regular’, i.e. when p does not divide the numerator of any of the numbers
        $$B_1,\ B_2,\ \ldots,\ B_{\frac12(p-3)},$$
where $B_k$ is the kth Bernoulli number defined at the beginning of § 7.9. It is known,
however, that there is an infinity of ‘irregular’ p. Various criteria have been
developed (notably by Vandiver) for the truth of the theorem when p is irregular.
The corresponding calculations have been carried out on the high-speed computer
SWAC and, as a result, the theorem is now known to be true for all p < 4002.
See Lehmer, Lehmer and Vandiver, Proc. Nat. Acad. Sci. (U.S.A.) 40 (1954),
25-33; Vandiver, ibid. 732-5, and Selfridge, Nicol and Vandiver, ibid. 41 (1955),
970-3.

The problem is much simplified if it is assumed that no one of x, y, z is divisible
by p. Wieferich proved in 1909 that there are no such solutions unless
$2^{p-1} \equiv 1 \pmod{p^2}$, which is true for p = 1093 (§ 6.10) but for no other p less than 2000.
Later writers have found further conditions of the same kind and by this means
it has been shown that there are no solutions of this kind for p < 253,747,889.
See Rosser, Bulletin Amer. Math. Soc. 46 (1940), 299-304, and 47 (1941), 109-10,
and Lehmer and Lehmer, ibid. 47 (1941), 139-42.
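
Wieferich’s condition is readily tested by machine. The sketch below is an illustration (the function names are hypothetical); it uses Python’s built-in three-argument `pow` for modular exponentiation and simple trial division for primality.

```python
def is_prime(p):
    """Trial division; adequate for the small range used here."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True


def wieferich_primes(bound):
    """Primes p < bound with 2^(p-1) congruent to 1 (mod p^2)."""
    return [p for p in range(2, bound)
            if is_prime(p) and pow(2, p - 1, p * p) == 1]
```

Below 2000 the search finds only p = 1093, as stated above.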

§ 13.3. Theorem 226 was actually proved by Fermat. See Dickson, History, ii,
ch. xxii.

§ 13.4. Theorem 227 was proved by Euler between 1753 and 1770. The proof 
was incomplete at one point, but the gap was filled by Legendre. See Dickson, 
History, ii, ch. xxi. 

Our proof follows that given by Landau, but Landau presents it as a first 
exercise in the use of ideals, which we have to avoid. 

§ 13.6. Theorem 234 is due to Richmond, Proc. London Math. Soc. (2) 21 (1923),
401-9. His proof is based on formulae given much earlier by Ryley [The ladies’
diary (1825), 35].

Ryley’s formulae have been reconsidered and generalized by Richmond [Proc.
Edinburgh Math. Soc. (2) 2 (1930), 92-100, and Journal London Math. Soc. 17
(1942), 196-7] and Mordell [Journal London Math. Soc. 17 (1942), 194-6].
Richmond finds solutions not included in Ryley’s; for example,
        $$3(1-t+t^2)x = s(1+t^3), \quad 3(1-t+t^2)y = s(3t-1-t^3), \quad 3(1-t+t^2)z = s(3t-3t^2),$$
where s is rational and $t = 3r/s^3$. Mordell solves the more general equation
        $$(X+Y+Z)^3-dXYZ = m,$$
of which (13.6.2) is a particular case. Our presentation of the proof is based on
Mordell’s. There are a number of other papers on cubic Diophantine equations in
three variables, by Mordell and B. Segre, in later numbers of the Journal. See also
Mordell, A chapter in the theory of numbers (Cambridge 1947), for an account of
work on the equation $y^2 = x^3+k$.

§ 13.7. The first results concerning ‘equal sums of two cubes’ were found by
Vieta before 1591. See Dickson, History, ii. 550 et seq. Theorem 235 is due to
Euler. Our method follows that of Hurwitz, Math. Werke, 2 (1933), 469-70.

Euler’s solution of (13.7.10) is given in Dickson, Introduction, 60-62. His
formulae, which are not quite so simple as (13.7.11), may be derived from the
latter by writing f+g and f-g for a and b and dividing by 2. The formulae
(13.7.11) themselves were first given by Gérardin, L’Intermédiaire des
mathématiciens, 24 (1917), 51. The simple solution here is due to Swinnerton-Dyer,
Journal London Math. Soc. 18 (1943), 2-4.

Leech (Proc. Cambridge Phil. Soc. 53 (1957), 778-80) lists numerical solutions of
(13.7.2), of (13.7.10), and of several other Diophantine equations.


XIV

14.1. Algebraic fields. In Ch. XII we considered the integers of
k(i) and $k(\rho)$, but did not develop the theory farther than was necessary
for the purposes of Ch. XIII. In this and the next chapter we carry
our investigation of the integers of quadratic fields a little farther.

An algebraic field is the aggregate of all numbers
        $$R(\vartheta) = \frac{P(\vartheta)}{Q(\vartheta)},$$
where $\vartheta$ is a given algebraic number, $P(\vartheta)$ and $Q(\vartheta)$ are polynomials
in $\vartheta$ with rational coefficients, and $Q(\vartheta) \ne 0$. We denote this field by
$k(\vartheta)$. It is plain that sums and products of numbers of $k(\vartheta)$ belong to
$k(\vartheta)$ and that $\alpha/\beta$ belongs to $k(\vartheta)$ if $\alpha$ and $\beta$ belong to $k(\vartheta)$ and $\beta \ne 0$.

In § 11.5, we defined an algebraic number $\xi$ as any root of an algebraic
equation

(14.1.1)    $$a_0x^n+a_1x^{n-1}+\ldots+a_n = 0,$$

where $a_0, a_1, \ldots$ are rational integers, not all zero. If $\xi$ satisfies an
algebraic equation of degree n, but none of lower degree, we say that
$\xi$ is of degree n.

If n = 1, then £ is rational and k(€) is the aggregate of rationals. 
Hence, for every rational £, kf) denotes the same aggregate, the field 
of rationals, which we denote by k(1). This field is part of every 
algebraic field. 

If n = 2, we say that ξ is ‘quadratic’. Then ξ is a root of a quadratic 
equation 
          a₀x² + a₁x + a₂ = 0 

and so 
          ξ = (a+b√m)/c,   √m = (cξ−a)/b 

for some rational integers a, b, c, m. Without loss of generality, we may 
take m to have no squared factor. It is then easily verified that the 
field k(ξ) is the same aggregate as k(√m). Hence it will be enough for 
us to consider the quadratic fields k(√m) for every ‘quadratfrei’ rational 
integer m, positive or negative (apart from m = 1). 
Any member ξ of k(√m) has the form 

          ξ = P(√m)/Q(√m) = (t+u√m)/(v+w√m) 
            = (t+u√m)(v−w√m)/(v²−w²m) = (a+b√m)/c 

for rational integers t, u, v, w, a, b, c. We have (cξ−a)² = mb², and so 
ξ is a root of 

(14.1.2)      c²x² − 2acx + a² − mb² = 0. 


Hence ξ is either rational or quadratic; i.e. every member of a quadratic 
field is either a rational or a quadratic number. 
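The statement that ξ = (a+b√m)/c is a root of (14.1.2) can be checked by exact arithmetic. The following Python sketch is an illustration only and is not part of the original argument; the pair representation (p, q) for p + q√m and the function names are our own assumptions.

```python
from fractions import Fraction


def quad_mul(u, v, m):
    """Multiply u = (p, q) and v = (r, s), each standing for p + q*sqrt(m)."""
    (p, q), (r, s) = u, v
    return (p * r + m * q * s, p * s + q * r)


def eval_14_1_2(a, b, c, m):
    """Evaluate c^2 x^2 - 2ac x + a^2 - m b^2 at x = (a + b*sqrt(m))/c.

    The result is returned as (rational part, coefficient of sqrt(m)).
    """
    x = (Fraction(a, c), Fraction(b, c))
    x2 = quad_mul(x, x, m)
    rational = c * c * x2[0] - 2 * a * c * x[0] + a * a - m * b * b
    irrational = c * c * x2[1] - 2 * a * c * x[1]
    return (rational, irrational)


# Both parts vanish, so xi satisfies (14.1.2).
print(eval_14_1_2(3, 5, 7, -11))
```

Both components are zero for every admissible a, b, c, m, which is exactly the identity (cξ−a)² = mb² written out.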

The field k(√m) includes a sub-class formed by all the algebraic 
integers of the field. In § 12.1 we defined an algebraic integer as any root 
of an equation 

(14.1.3)      xʲ + c₁xʲ⁻¹ + ... + cⱼ = 0, 

where c₁,..., cⱼ are rational integers. We appear then to have a choice 
in defining the integers of k(√m). We may say that a number ξ of 
k(√m) is an integer of k(√m) (i) if ξ satisfies an equation of the form 
(14.1.3) for some j, or (ii) if ξ satisfies an equation of the form (14.1.3) 
with j = 2. In the next section, however, we show that the set of 
integers of k(√m) is the same whichever definition we use. 


14.2. Algebraic numbers and integers; primitive polynomials. 
We say that the integral polynomial 

(14.2.1)      f(x) = a₀xⁿ + a₁xⁿ⁻¹ + ... + aₙ 

is a primitive polynomial if 

          a₀ > 0,   (a₀, a₁,..., aₙ) = 1 

in the notation of p. 20. Under the same conditions, we call (14.1.1) 
a primitive equation. The equation (14.1.3) is obviously primitive. 

THEOREM 236. An algebraic number ξ of degree n satisfies a unique 
primitive equation of degree n. If ξ is an algebraic integer, the coefficient 
of xⁿ in this primitive equation is unity. 

For n = 1, the first part is trivial; the second part is equivalent to 
Theorem 206. Hence Theorem 236 is a generalization of Theorem 206. 
We shall deduce Theorem 236 from 


THEOREM 237. Let ξ be an algebraic number of degree n and let 
f(x) = 0 be a primitive equation of degree n satisfied by ξ. Let g(x) = 0 
be any primitive equation satisfied by ξ. Then g(x) = f(x)h(x) for some 
primitive polynomial h(x) and all x. 

By the definition of ξ and n there must be at least one polynomial 
f(x) of degree n such that f(ξ) = 0. We may clearly suppose f(x) 
primitive. Again the degree of g(x) cannot be less than n. Hence we 
can divide g(x) by f(x) by means of the division algorithm of elementary 
algebra and obtain a quotient H(x) and a remainder K(x), such that 

(14.2.2)      g(x) = f(x)H(x) + K(x), 

H(x) and K(x) are polynomials with rational coefficients, and K(x) is 
of degree less than n. 

If we put x = ξ in (14.2.2), we have K(ξ) = 0. But this is impossible, 
since ξ is of degree n, unless K(x) has all its coefficients zero. Hence 

          g(x) = f(x)H(x). 

If we multiply this throughout by an appropriate rational integer, we 
obtain 

(14.2.3)      cg(x) = f(x)h(x), 

where c is a positive integer and h(x) is an integral polynomial. Let d be 
the highest common divisor of the coefficients of h(x). Since g is 
primitive, we must have d | c. Hence, if d > 1, we may remove the factor d; 
that is, we may take h(x) primitive in (14.2.3). Now suppose that p | c, 
where p is prime. It follows that f(x)h(x) ≡ 0 (mod p) and so, by Theorem 
104 (i), either f(x) ≡ 0 or h(x) ≡ 0 (mod p). Both are impossible for 
primitive f and h and so c = 1. This is Theorem 237. 

The proof of Theorem 236 is now simple. If g(x) = 0 is a primitive 
equation of degree n satisfied by ξ, then h(x) is a primitive polynomial 
of degree 0; i.e. h(x) = 1 and g(x) = f(x) for all x. Hence f(x) is unique. 

If ξ is an algebraic integer, then ξ satisfies an equation of the form 
(14.1.3) for some j ≥ n. We write g(x) for the left-hand side of (14.1.3) 
and, by Theorem 237, we have 

          g(x) = f(x)h(x), 

where h(x) is of degree j−n. If f(x) = a₀xⁿ+... and h(x) = h₀xʲ⁻ⁿ+..., 
we have 1 = a₀h₀, and so a₀ = 1. This completes the proof of 
Theorem 236. 


14.3. The general quadratic field k(√m). We now define the integers 
of k(√m) as those algebraic integers which belong to k(√m). We use 
‘integer’ throughout this chapter and Ch. XV for an integer of the 
particular field in which we are working. 

With the notation of § 14.1, let 

          ξ = (a+b√m)/c 

be an integer, where we may suppose that c > 0 and (a, b, c) = 1. 
If b = 0, then ξ = a/c is rational, c = 1, and ξ = a, any rational integer. 

If b ≠ 0, ξ is quadratic. Hence, if we divide (14.1.2) through by c², 
we obtain a primitive equation whose leading coefficient is 1. Thus 
c | 2a and c² | (a²−mb²). If d = (a, c), we have 

          d² | a², d² | c², d² | (a²−mb²) → d² | mb² → d | b, 

since m has no squared factor. But (a, b, c) = 1 and so d = 1. Since 
c | 2a, we have c = 1 or 2. 


If c = 2, then a is odd and mb² ≡ a² ≡ 1 (mod 4), so that b is odd 
and m ≡ 1 (mod 4). We must therefore distinguish two cases. 
(i) If m ≢ 1 (mod 4), then c = 1 and the integers of k(√m) are 

          ξ = a+b√m 

with rational integral a, b. In this case m ≡ 2 or m ≡ 3 (mod 4). 

(ii) If m ≡ 1 (mod 4), one integer of k(√m) is τ = ½(√m−1) and all 
the integers can be expressed simply in terms of this τ. If c = 2, we 
have a and b odd and 

          ξ = (a+b√m)/2 = (a+b)/2 + bτ = a₁+(2b₁+1)τ, 

where a₁, b₁ are rational integers. If c = 1, 

          ξ = a+b√m = a+b+2bτ = a₁+2b₁τ, 

where a₁, b₁ are rational integers. Hence, if we change our notation 
a little, the integers of k(√m) are the numbers a+bτ with rational 
integral a, b. 


THEOREM 238. The integers of k(√m) are the numbers 
          a+b√m 
when m ≡ 2 or m ≡ 3 (mod 4), and the numbers 
          a+bτ = a+½b(√m−1) 
when m ≡ 1 (mod 4), a and b being in either case rational integers. 


The field k(i) is an example of the first case and the field k{√(−3)} of 
the second. In the latter case 

          τ = −½+½i√3 = ρ 

and the field is the same as k(ρ). If the integers of k(ϑ) can be expressed 
as 
          a+bφ, 
where a and b run through the rational integers, then we say that [1, φ] 
is a basis of the integers of k(ϑ). Thus [1, i] is a basis of the integers of 
k(i), and [1, ρ] of those of k{√(−3)}. 
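As a sanity check on Theorem 238, one may compare its description of the integers with the divisibility conditions c | 2a, c² | (a²−mb²) obtained from (14.1.2). The Python sketch below is our own illustration under the assumptions of the text (c > 0, (a, b, c) = 1, b ≠ 0, m quadratfrei); the function names are hypothetical.

```python
from math import gcd


def is_integer_by_equation(a, b, c, m):
    """(a + b*sqrt(m))/c is an integer iff (14.1.2) divided by c^2 is integral."""
    return (2 * a) % c == 0 and (a * a - m * b * b) % (c * c) == 0


def is_integer_by_theorem_238(a, b, c, m):
    """Description of Theorem 238, for c > 0 and gcd(a, b, c) = 1."""
    if c == 1:
        return True
    if c == 2 and m % 4 == 1:
        return a % 2 == 1 and b % 2 == 1
    return False


def agree_on_box(m, bound=6):
    """Compare the two criteria for all reduced (a, b, c) in a small box."""
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            for c in range(1, bound + 1):
                if b == 0 or gcd(gcd(a, b), c) != 1:
                    continue
                if is_integer_by_equation(a, b, c, m) != is_integer_by_theorem_238(a, b, c, m):
                    return False
    return True


for m in (-11, -7, -5, -3, -2, -1, 2, 3, 5, 6, 7, 10, 13):
    print(m, agree_on_box(m))
```

Python's `%` returns a non-negative remainder for a positive modulus, so `m % 4 == 1` also recognizes negative m such as −3 and −7.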

14.4. Unities and primes. The definitions of divisibility, divisor, 
unity, and prime in k(√m) are the same as in k(i); thus α is divisible 
by β, or β | α, if there is an integer γ of k(√m) such that α = βγ.† A unity 
ε is a divisor of 1, and of every integer of the field. In particular 1 and 
−1 are unities. The numbers εξ are the associates of ξ, and a prime is 
a number divisible only by the unities and its associates. 

THEOREM 239. If ε₁ and ε₂ are unities, then ε₁ε₂ and ε₁/ε₂ are unities. 

There are a δ₁ and a δ₂ such that ε₁δ₁ = 1, ε₂δ₂ = 1, and 

          ε₁ε₂δ₁δ₂ = 1 → ε₁ε₂ | 1. 

Hence ε₁ε₂ is a unity. Also δ₂ = 1/ε₂ is a unity; and so, combining 
these results, ε₁/ε₂ is a unity. 

We call ξ̄ = r−s√m the conjugate of ξ = r+s√m. When m < 0, ξ̄ is 
also the conjugate of ξ in the sense of analysis, ξ and ξ̄ being conjugate 
complex numbers; but when m > 0 the meaning is different. 

The norm Nξ of ξ is defined by 

          Nξ = ξξ̄ = (r+s√m)(r−s√m) = r²−ms². 

If ξ is an integer, then Nξ is a rational integer. If m ≡ 2 or 3 (mod 4), 
and ξ = a+b√m, then 
          Nξ = a²−mb²; 
and if m ≡ 1 (mod 4), and ξ = a+bτ, then 
          Nξ = (a−½b)²−¼mb². 
Norms are positive in complex fields, but not necessarily in real fields. 
In any case N(ξη) = NξNη. 

THEOREM 240. The norm of a unity is ±1, and every number whose 
norm is ±1 is a unity. 

For (a)   ε | 1 → εδ = 1 → NεNδ = 1 → Nε = ±1, 
and (b)   ξξ̄ = Nξ = ±1 → ξ | 1. 

If m < 0, m = −μ, then the equations 

          a²+μb² = 1             (m ≡ 2, 3 (mod 4)), 
          (a−½b)²+¼μb² = 1       (m ≡ 1 (mod 4)) 

have only a finite number of solutions. This number is 4 in k(i), 6 in 
k(ρ), and 2 otherwise, since 
          a = ±1, b = 0 
are the only solutions when μ > 3. 
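The counts 4, 6, and 2 can be confirmed by direct enumeration. The following Python sketch is only an illustration; the function name and the search bound are our own choices, and the second equation is multiplied by 4 to keep the arithmetic integral.

```python
def count_unities(mu, bound=3):
    """Count the integers of norm 1 in k(sqrt(-mu)), mu > 0 quadratfrei."""
    m = -mu
    count = 0
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if m % 4 == 1:
                # (a - b/2)^2 + mu*b^2/4 = 1, multiplied through by 4
                if (2 * a - b) ** 2 + mu * b * b == 4:
                    count += 1
            elif a * a + mu * b * b == 1:
                count += 1
    return count


for mu in (1, 2, 3, 5, 7, 11):
    print(mu, count_unities(mu))
```

A bound of 3 is sufficient here, since any solution has |b| ≤ 2 and |a| ≤ 2.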


† If α and β are rational integers, then γ is rational, and so a rational integer, so that 
β | α then means the same in k{√(−m)} as in k(1). 

There are an infinity of unities in a real field, as we shall see in a 
moment in k(√2). 
Nξ may be negative in a real field, but 

          Mξ = |Nξ| 

is a positive integer, except when ξ = 0. Hence, repeating the 
arguments of § 12.7, with Mξ in the place of Nξ when the field is real, we 
obtain 


THEOREM 241. An integer whose norm is a rational prime is prime. 

THEOREM 242. An integer, not 0 or a unity, can be expressed as a pro- 
duct of primes. 

The question of the uniqueness of the expression remains open. 

14.5. The unities of k(√2). When m = 2, 

          Nξ = a²−2b² 
and       a²−2b² = −1 
has the solutions 1, 1 and −1, 1. Hence 

          ω = 1+√2,   ω⁻¹ = −ω̄ = −1+√2 

are unities. It follows, after Theorem 239, that all the numbers 

(14.5.1)      ±ωⁿ, ±ω⁻ⁿ   (n = 0, 1, 2,...) 

are unities. There are unities, of either sign, as large or as small as 
we please. 

THEOREM 243. The numbers (14.5.1) are the only unities of k(√2). 

(i) We prove first that there is no unity ε between 1 and ω. If there 
were, we should have 

          1 < x+y√2 = ε < 1+√2 
and       x²−2y² = ±1; 
so that   −1 < x−y√2 < 1, 
          0 < 2x < 2+√2. 

Hence x = 1 and 1 < 1+y√2 < 1+√2, which is impossible for 
integral y. 

(ii) If ε > 0, then either ε = ωⁿ or 

          ωⁿ < ε < ωⁿ⁺¹ 

for some integral n. In the latter case ω⁻ⁿε is a unity, by Theorem 239, 
and lies between 1 and ω. This contradicts (i); and therefore every 
positive ε is an ωⁿ. Since −ε is a unity if ε is a unity, this proves the 
theorem. 

Since Nω = −1, Nω² = 1, we have proved incidentally 
THEOREM 244. All rational integral solutions of 

          x²−2y² = 1 
are given by 
          x+y√2 = ±(1+√2)²ⁿ, 
and all of 
          x²−2y² = −1 
by 
          x+y√2 = ±(1+√2)²ⁿ⁺¹, 
with n a rational integer. 
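Theorem 244 is easy to illustrate numerically for n ≥ 0: successive multiplication by ω = 1+√2 alternates the norm between 1 and −1. The Python sketch below is our own illustration; the function name is hypothetical.

```python
def powers_of_omega(count):
    """Return [(n, x, y, x^2 - 2y^2)] where x + y*sqrt(2) = (1 + sqrt(2))^n."""
    x, y = 1, 0  # omega^0
    rows = []
    for n in range(count):
        rows.append((n, x, y, x * x - 2 * y * y))
        # (x + y*sqrt(2)) * (1 + sqrt(2)) = (x + 2y) + (x + y)*sqrt(2)
        x, y = x + 2 * y, x + y
    return rows


for row in powers_of_omega(8):
    print(row)
```

Even exponents give solutions of x²−2y² = 1 and odd exponents solutions of x²−2y² = −1, in agreement with the theorem.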
The equation 
          x²−my² = 1, 
where m is positive and not a square, has always an infinity of solutions, 
which may be found from the continued fraction for √m. In this case 

          √2 = 1 + 1/(2+) 1/(2+) ..., 

the length of the period is 1, and the solution is particularly simple. 
If the convergents are 

          pₙ/qₙ = 1/1, 3/2, 7/5,...   (n = 0, 1, 2,...) 

then pₙ, qₙ, and 
          φₙ = pₙ+qₙ√2,   ψₙ = pₙ−qₙ√2 
are solutions of 
          xₙ = 2xₙ₋₁+xₙ₋₂. 
From      φ₀ = ω, φ₁ = ω², ψ₀ = −ω⁻¹, ψ₁ = ω⁻², 
and 
          ωⁿ = 2ωⁿ⁻¹+ωⁿ⁻²,   (−ω)⁻ⁿ = 2(−ω)⁻ⁿ⁺¹+(−ω)⁻ⁿ⁺², 
it follows that 
          φₙ = ωⁿ⁺¹,   ψₙ = (−ω)⁻ⁿ⁻¹ 
for all n. Hence 

          pₙ = ½{ωⁿ⁺¹+(−ω)⁻ⁿ⁻¹} = ½{(1+√2)ⁿ⁺¹+(1−√2)ⁿ⁺¹}, 
          qₙ = ¼√2{ωⁿ⁺¹−(−ω)⁻ⁿ⁻¹} = ¼√2{(1+√2)ⁿ⁺¹−(1−√2)ⁿ⁺¹}, 

and       pₙ²−2qₙ² = φₙψₙ = (−1)ⁿ⁺¹. 
The convergents of odd rank give solutions of x²−2y² = 1 and those 
of even rank solutions of x²−2y² = −1. 
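The recurrence xₙ = 2xₙ₋₁+xₙ₋₂ gives the convergents to √2 directly, and the identity pₙ²−2qₙ² = (−1)ⁿ⁺¹ can be observed on the first few of them. The following Python sketch is an illustration only; the function name is our own.

```python
def sqrt2_convergents(count):
    """Return the first `count` convergents (p_n, q_n) of sqrt(2) = [1; 2, 2, 2, ...]."""
    p_prev, q_prev = 1, 0  # the conventional p_{-1}, q_{-1}
    p, q = 1, 1            # p_0 / q_0 = 1/1
    out = [(p, q)]
    for _ in range(count - 1):
        # every partial quotient after the first is 2
        p, p_prev = 2 * p + p_prev, p
        q, q_prev = 2 * q + q_prev, q
        out.append((p, q))
    return out


for n, (p, q) in enumerate(sqrt2_convergents(8)):
    print(n, p, q, p * p - 2 * q * q)
```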

If x²−2y² = 1 and x/y > 0, then 

          0 < x/y − √2 = 1/{y(x+y√2)} < 1/(y·2y√2) < 1/(2y²). 

Hence, by Theorem 184, x/y is a convergent. The convergents also give 
all the solutions of the other equation, but this is not quite so easy to 
prove. In general, only some of the convergents to √m yield unities of 
k(√m). 

14.6. Fields in which the fundamental theorem is false. The 
fundamental theorem of arithmetic is true in k(1), k(i), k(ρ), and 
(though we have not yet proved so) in k(√2). It is important to show by 
examples, before proceeding farther, that it is not true in every k(√m). 
The simplest examples are m = −5 and (among real fields) m = 10. 

(i) Since −5 ≡ 3 (mod 4), the integers of k{√(−5)} are a+b√(−5). 
It is easy to verify that the four numbers 

          2, 3, 1+√(−5), 1−√(−5) 

are prime. Thus 

          1+√(−5) = {a+b√(−5)}{c+d√(−5)} 
implies   6 = (a²+5b²)(c²+5d²); 
and a²+5b² must be 2 or 3, if neither factor is a unity. Since neither 
2 nor 3 is of this form, 1+√(−5) is prime; and the other numbers may 
be proved prime similarly. But 

          6 = 2·3 = {1+√(−5)}{1−√(−5)}, 

and 6 has two distinct decompositions into primes. 

(ii) Since 10 ≡ 2 (mod 4), the integers of k(√10) are a+b√10. In 
this case 
          6 = 2·3 = (4+√10)(4−√10), 

and it is again easy to prove that all four factors are prime. Thus, for 
example, 
          2 = (a+b√10)(c+d√10) 
implies   4 = (a²−10b²)(c²−10d²), 

and a²−10b² must be ±2, if neither factor is a unity. This is impossible 
because neither of ±2 is a quadratic residue of 10.† 

The falsity of the fundamental theorem in these fields involves the 
falsity of other theorems which are central in the arithmetic of k(1). 
Thus, if α and β are integers of k(1), without a common factor, there 
are integers λ and μ for which 

          αλ+βμ = 1. 

This theorem is false in k{√(−5)}. Suppose, for example, that α and β 
are the primes 3 and 1+√(−5). Then 

          3{a+b√(−5)}+{1+√(−5)}{c+d√(−5)} = 1 
involves  3a+c−5d = 1,   3b+c+d = 0 
and so    3a−3b−6d = 1, 

which is impossible. 
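The two facts used for k{√(−5)} lend themselves to a small computation: that neither 2 nor 3 is of the form a²+5b², and that the pair of equations 3a+c−5d = 1, 3b+c+d = 0 has no integral solution. The Python sketch below is our own illustration, with hypothetical function names; the finite search for the second fact only illustrates the congruence argument (mod 3) and does not replace it.

```python
def mul(u, v):
    """Multiply u = (a, b) and v = (c, d), each standing for a + b*sqrt(-5)."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)


def is_norm(n):
    """True if n = a^2 + 5 b^2 for some rational integers a, b (n >= 0)."""
    return any(a * a + 5 * b * b == n
               for a in range(n + 1) for b in range(n + 1))


def bezout_solutions(bound):
    """All (a, b, c, d) in a box with 3(a + b w) + (1 + w)(c + d w) = 1, w = sqrt(-5)."""
    box = range(-bound, bound + 1)
    return [(a, b, c, d)
            for a in box for b in box for c in box for d in box
            if 3 * a + c - 5 * d == 1 and 3 * b + c + d == 0]


print(mul((1, 1), (1, -1)))        # the second factorization of 6
print(is_norm(2), is_norm(3))      # neither 2 nor 3 is a norm
print(bezout_solutions(6))         # no solutions in the box
```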


† 1², 2², 3², 4², 5², 6², 7², 8², 9² ≡ 1, 4, 9, 6, 5, 6, 9, 4, 1 (mod 10). 

14.7. Complex Euclidean fields. A simple field is a field in 
which the fundamental theorem is true. The arithmetic of simple 
fields follows the lines of rational arithmetic, while in other cases a new 
foundation is required. The problem of determining all simple fields is 
very difficult, and no complete solution has been found, though 
Heilbronn has proved that, when m is negative, the number of simple fields 
is finite. 

We proved the fundamental theorem in k(i) and k(ρ) by establishing 
an analogue of Euclid’s algorithm in k(1). Let us suppose, generally, 
that the proposition 

(E) given integers γ and γ₁, with γ₁ ≠ 0, then there is an integer κ such 
that 
          γ = κγ₁+γ₂,   |Nγ₂| < |Nγ₁| 

is true in k(√m). This is what we proved, for k(i) and k(ρ), in Theorems 
216 and 219; but we have replaced Nγ by |Nγ| in order to include real 
fields. In these circumstances we say that there is a Euclidean algorithm 
in k(√m), or that the field is Euclidean. 

We can then repeat the arguments of §§ 12.8 and 12.9 (with the 
substitution of |Nγ| for Nγ), and we conclude that 

THEOREM 245. The fundamental theorem is true in any Euclidean 
quadratic field. 


The conclusion is not confined to quadratic fields, but it is only in 
such fields that we have defined Ny and are in a position to state it 
precisely. 

(E) is plainly equivalent to 

(E′) given any δ (integral or not) of k(√m), there is an integer κ such that 

(14.7.1)      |N(δ−κ)| < 1. 

Suppose now that 
          δ = r+s√m, 
where r and s are rational. If m ≢ 1 (mod 4) then 
          κ = x+y√m, 
where x and y are rational integers, and (14.7.1) is 

(14.7.2)      |(r−x)²−m(s−y)²| < 1. 

If m ≡ 1 (mod 4) then 

          κ = x+y+½y(√m−1) = x+½y+½y√m,† 

where x and y are rational integers, and (14.7.1) is 

(14.7.3)      |(r−x−½y)²−m(s−½y)²| < 1. 

† The form of § 14.3 with x+y, y for a, b. 

When m = −μ < 0, it is easy to determine all fields in which these 
inequalities can be satisfied for any r, s and appropriate x, y. 

THEOREM 246. There are just five complex Euclidean quadratic fields, 
viz. the fields in which 
          m = −1, −2, −3, −7, −11. 
There are two cases. 
(i) When m ≢ 1 (mod 4), we take r = ½, s = ½ in (14.7.2); and we 
require 
          ¼+¼μ < 1, 
or μ < 3. Hence μ = 1 and μ = 2 are the only possible cases; and in 
these cases we can plainly satisfy (14.7.2), for any r and s, by taking 
x and y to be the integers nearest to r and s. 

(ii) When m ≡ 1 (mod 4) we take r = ¼, s = ¼ in (14.7.3). We require 

          1/16 + μ/16 < 1. 

Since μ ≡ 3 (mod 4), the only possible values of μ are 3, 7, 11. Given 
s, there is a y for which 
          |2s−y| ≤ ½, 
and an x for which 
          |r−x−½y| ≤ ½; 
and then 
          |(r−x−½y)²−m(s−½y)²| ≤ ¼ + 11/16 = 15/16 < 1. 

Hence (14.7.3) can be satisfied when μ has one of the three values in 
question. 
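The choice of x and y made in the proof of Theorem 246 is constructive, and can be tried out in exact rational arithmetic. The Python sketch below is our own illustration of that choice; the function names are hypothetical, and the grid of test points is arbitrary.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def nearest(t):
    """An integer n with |t - n| <= 1/2."""
    return math.floor(t + HALF)


def nearest_field_integer(r, s, m):
    """Choose (x, y) as in the proof: (14.7.2) if m is not 1 mod 4, else (14.7.3)."""
    if m % 4 == 1:
        y = nearest(2 * s)           # |2s - y| <= 1/2
        x = nearest(r - y * HALF)    # |r - x - y/2| <= 1/2
    else:
        x, y = nearest(r), nearest(s)
    return x, y


def norm_of_difference(r, s, m, x, y):
    """N(delta - kappa) for delta = r + s*sqrt(m) and the integer kappa given by (x, y)."""
    if m % 4 == 1:
        return (r - x - y * HALF) ** 2 - m * (s - y * HALF) ** 2
    return (r - x) ** 2 - m * (s - y) ** 2


worst = {}
for m in (-1, -2, -3, -7, -11):
    values = []
    for i in range(-12, 13):
        for j in range(-12, 13):
            r, s = Fraction(i, 12), Fraction(j, 12)
            x, y = nearest_field_integer(r, s, m)
            values.append(abs(norm_of_difference(r, s, m, x, y)))
    worst[m] = max(values)
print(worst)
```

For m = −5 the same experiment fails at r = s = ½, where every choice of x, y gives a norm of at least 3/2, in agreement with case (i).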

There are other simple fields, such as k{√(−19)} and k{√(−43)}, which 
do not possess an algorithm; the condition is sufficient but not necessary 
for simplicity. The fields corresponding to 

          m = −1, −2, −3, −7, −11, −19, −43, −67, −163 

are simple, and Heilbronn and Linfoot have proved that there is at 
most one more. Stark has proved that for this field (if it exists) 
          m < −exp(2·2×10⁷) 
but its existence is highly improbable. 


14.8. Real Euclidean fields. The real fields with an algorithm are 
more numerous and it is only very recently that they have been com- 
pletely determined. 

THEOREM 247.* k(√m) is Euclidean when 

          m = 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73 
and for no other positive m. 

We can plainly satisfy (14.7.2) when m = 2 or m = 3, since we can 
choose x and y so that |r−x| ≤ ½ and |s−y| ≤ ½. Hence k(√2) and 
k(√3) are Euclidean, and therefore simple. We cannot prove Theorem 
247 here, but we shall prove 

THEOREM 248. k(√m) is Euclidean when 
          m = 2, 3, 5, 6, 7, 13, 17, 21, 29. 
If we write 
          λ = 0,   n = m     (m ≢ 1 (mod 4)), 
          λ = ½,   n = ¼m    (m ≡ 1 (mod 4)), 
and replace 2s by s when m ≡ 1, then we can combine (14.7.2) and 
(14.7.3) in the form 

(14.8.1)      |(r−x−λy)²−n(s−y)²| < 1. 

Let us assume that there is no algorithm in k(√m). Then (14.8.1) is 
false for some rational r, s and all integral x, y; and we may suppose 
that† 
(14.8.2)      0 ≤ r ≤ ½,   0 ≤ s ≤ ½. 

There is therefore a pair r, s, satisfying (14.8.2), such that one or other of 
          [P(x, y)]   (r−x−λy)² ≥ 1+n(s−y)², 
          [N(x, y)]   n(s−y)² ≥ 1+(r−x−λy)² 
is true for every x, y. The particular inequalities which we shall use are 
          [P(0, 0)]   r² ≥ 1+ns²,        [N(0, 0)]   ns² ≥ 1+r², 
          [P(1, 0)]   (1−r)² ≥ 1+ns²,    [N(1, 0)]   ns² ≥ 1+(1−r)², 
          [P(−1, 0)]  (1+r)² ≥ 1+ns²,    [N(−1, 0)]  ns² ≥ 1+(1+r)². 

† This is very easy to see when m ≢ 1 (mod 4) and the left-hand side of (14.8.1) is 
          |(r−x)²−m(s−y)²|; 
for this is unaltered if we write 
          ε₁r+u, ε₁x+u, ε₂s+v, ε₂y+v, 
where ε₁ and ε₂ are each 1 or −1, and u and v are integers, for 
          r, x, s, y; 
and we can always choose ε₁, ε₂, u, v so that ε₁r+u and ε₂s+v lie between 0 and ½ inclusive. 

The situation is a little more complex when m ≡ 1 (mod 4) and the left-hand side 
of (14.8.1) is 
          |(r−x−½y)²−¼m(s−y)²|. 

This is unaltered by the substitution of any of 

          (1) ε₁r+u, ε₁x+u, ε₁s, ε₁y, 
          (2) r, x−v, s+2v, y+2v, 
          (3) r, x+y, −s, −y, 
          (4) ½−r, −x, 1−s, 1−y, 
for r, x, s, y. We first use (1) to make 0 ≤ r ≤ ½; then (2) to make −1 ≤ s ≤ 1; and 
then, if necessary, (3) to make 0 ≤ s ≤ 1. If then 0 ≤ s ≤ ½, the reduction is completed. 
If ½ ≤ s ≤ 1, we end by using (4), as we can do because ½−r lies between 0 and ½ if 
r does so. 

One at least of each of these pairs of inequalities is true for some r and 
s satisfying (14.8.2). If r = s = 0, P(0, 0) and N(0, 0) are both false, 
so that this possibility is excluded. 

Since r and s satisfy (14.8.2), and are not both 0, P(0, 0) and P(1, 0) 
are false; and therefore N(0, 0) and N(1, 0) are true. If P(−1, 0) were 
true, then N(1, 0) and P(−1, 0) would give 

          (1+r)² ≥ 1+ns² ≥ 2+(1−r)² 

and so 4r ≥ 2. From this and (14.8.2) it would follow that r = ½ and 
ns² = 5/4, which is impossible.† Hence P(−1, 0) is false, and therefore 
N(−1, 0) is true. This gives 
          ns² ≥ 1+(1+r)² ≥ 2, 

and this and (14.8.2) give n ≥ 8. 

It follows that there is an algorithm in all cases in which n < 8, 
and these are the cases enumerated in Theorem 248. 

There is no algorithm when m = 23. Take r = 0, s = 7/23. Then 
(14.8.1) is 
          |23x²−(23y−7)²| < 23. 
Since     ξ = 23x²−(23y−7)² ≡ −49 ≡ −3 (mod 23), 
ξ must be −3 or 20, and it is easy to see that each of these hypotheses 
is impossible. Suppose, for example, that 
          ξ = 23X²−Y² = −3. 

Then neither X nor Y can be divisible by 3, and 

          X² ≡ 1, Y² ≡ 1, ξ ≡ 22 ≡ 1 (mod 3), 
a contradiction. 
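Both hypotheses ξ = −3 and ξ = 20 can also be excluded by a finite computation with congruences. The Python sketch below is our own illustration: the moduli 9 and 25 are our choice and are not taken from the text, and the function names are hypothetical.

```python
def values_mod(modulus):
    """All residues taken by 23*X^2 - Y^2 modulo `modulus`."""
    return {(23 * X * X - Y * Y) % modulus
            for X in range(modulus) for Y in range(modulus)}


def small_solutions(bound):
    """All (x, y) in a box with |23 x^2 - (23 y - 7)^2| < 23."""
    box = range(-bound, bound + 1)
    return [(x, y) for x in box for y in box
            if abs(23 * x * x - (23 * y - 7) ** 2) < 23]


print((-3) % 9 in values_mod(9))   # 23 X^2 - Y^2 = -3 is impossible mod 9
print(20 in values_mod(25))        # 23 X^2 - Y^2 = 20 is impossible mod 25
print(small_solutions(50))         # consistent with the absence of an algorithm
```

The first check is a slightly stronger form of the argument (mod 3) in the text; the second uses the fact that 3 is not a quadratic residue of 5, so that X and Y would both be divisible by 5.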


The field k(√23), though not Euclidean, is simple; but we cannot prove 
this here. 


14.9. Real Euclidean fields (continued). It is naturally more 
difficult to prove that k(√m) is not Euclidean for all positive m except those 
listed in Theorem 247, than to prove k(√m) Euclidean for particular 
values of m. In this direction we prove only 


THEOREM 249. The number of real Euclidean fields k(√m), where 
m ≡ 2 or 3 (mod 4), is finite. 


† Suppose that s = p/q, where (p, q) = 1. If m ≢ 1 (mod 4), then m = n and 
          4mp² = 5q². 
Hence p² | 5, so that p = 1; and q² | 4m. But m has no squared factor, and 0 ≤ s ≤ ½. 
Hence q = 2, s = ½, and m = 5 ≡ 1 (mod 4), a contradiction. 
If m ≡ 1 (mod 4), then m = 4n and 
          mp² = 5q². 
From this we deduce p = 1, q = 1, s = 1, in contradiction to (14.8.2). 

Let us suppose k(√m) Euclidean and m ≢ 1 (mod 4). We take r = 0 
and s = t/m in (14.7.2), where t is an integer to be chosen later. Then 
there are rational integers x, y such that 

          |x² − (my−t)²/m| < 1,   |(my−t)²−mx²| < m. 

Since     (my−t)²−mx² ≡ t² (mod m), 
there are rational integers x, z such that 

(14.9.1)      z²−mx² ≡ t² (mod m),   |z²−mx²| < m. 

If m ≡ 3 (mod 4), we choose t an odd integer such that 

          5m < t² < 6m, 

as we certainly can do if m is large enough. By (14.9.1), z²−mx² is 
equal to t²−5m or to t²−6m, so that one of 

(14.9.2)      t²−z² = m(5−x²),   t²−z² = m(6−x²) 

is true. But, to modulus 8, 

          t² ≡ 1;   z², x² ≡ 0, 1, or 4;   m ≡ 3 or 7; 
          t²−z² ≡ 0, 1, or 5; 
          5−x² ≡ 1, 4, or 5;   6−x² ≡ 2, 5, or 6; 
          m(5−x²) ≡ 3, 4, or 7;   m(6−x²) ≡ 2, 3, 6, or 7; 

and, however we choose the residues, each of (14.9.2) is impossible. 

If m ≡ 2 (mod 4), we choose t odd and such that 2m < t² < 3m, as 
we can if m is large enough. In this case, one of 

(14.9.3)      t²−z² = m(2−x²),   t²−z² = m(3−x²) 

is true. But, to modulus 8, m ≡ 2 or 6; 

          2−x² ≡ 1, 2, or 6;   3−x² ≡ 2, 3, or 7; 
          m(2−x²) ≡ 2, 4, or 6;   m(3−x²) ≡ 2, 4, or 6; 

and each of (14.9.3) is impossible. 
Hence, if m ≡ 2 or 3 (mod 4) and if m is large enough, k(√m) cannot 
be Euclidean. This is Theorem 249. The same is, of course, true for 
m ≡ 1, but the proof is distinctly more difficult. 
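The residue tables to modulus 8 used in this proof are short enough to recompute mechanically. The following Python sketch is our own illustration; the names are hypothetical.

```python
# Squares to modulus 8: every square is congruent to 0, 1, or 4.
SQUARES = {x * x % 8 for x in range(8)}

# Left-hand side t^2 - z^2 of (14.9.2) and (14.9.3), with t odd so that t^2 = 1 (mod 8).
LHS = {(1 - z2) % 8 for z2 in SQUARES}


def rhs(m_residues, k):
    """Residues (mod 8) of m*(k - x^2) for the given residues of m."""
    return {(m * (k - x2)) % 8 for m in m_residues for x2 in SQUARES}


print(sorted(LHS))
print(sorted(rhs({3, 7}, 5)), sorted(rhs({3, 7}, 6)))   # case m = 3 (mod 4)
print(sorted(rhs({2, 6}, 2)), sorted(rhs({2, 6}, 3)))   # case m = 2 (mod 4)
```

In each case the set of residues of the right-hand side has no element in common with {0, 1, 5}, which is the contradiction required.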


NOTES ON CHAPTER XIV 


§§ 14.1-6. The theory of quadratic fields is developed in detail in Bachmann’s 
Grundlehren der neueren Zahlentheorie (Göschens Lehrbücherei, no. 3, ed. 2, 1931) 
and Sommer’s Vorlesungen über Zahlentheorie. There is a French translation of 
Sommer’s book, with the title Introduction à la théorie des nombres algébriques 
(Paris, 1911); and a more elementary account of the theory, with many numerical 
examples, in Reid’s The elements of the theory of algebraic numbers (New York, 
1910). 

§ 14.5. The equation x²−my² = 1 is usually called Pell’s equation, but this 
is the result of a misunderstanding. See Dickson, History, ii, ch. xii, especially 
pp. 341, 351, 354. There is a very full account of the history of the equation in 
Whitford’s The Pell equation (New York, 1912). 

§ 14.7. The work of Heilbronn and Linfoot referred to will be found in the 
Quarterly Journal of Math. (Oxford), 5 (1934), 150-60 and 293-301. Stark’s result 
[Trans. Amer. Math. Soc. 122 (1966), 112-9] is an improvement of Lehmer’s 
that m < −5·10⁹. 

§§ 14.8-9. Theorem 247 is essentially due to Chatland and Davenport [Canadian 
Journal of Math. 2 (1950), 289-96]. Davenport [Proc. London Math. Soc. (2) 
53 (1951), 65-82] showed that k(√m) cannot be Euclidean if m > 2¹⁴ = 16384, 
which reduced the proof of Theorem 247 to the study of a finite number of values 
of m. Chatland [Bulletin Amer. Math. Soc. 55 (1949), 948-53] gives a list of 
references to previous results, including a mistaken announcement by another 
that k(√97) was Euclidean. Barnes and Swinnerton-Dyer [Acta Math. 87 (1952), 
259-323] show that k(√97) is not, in fact, Euclidean. 

Our proof of Theorem 248 is due to Oppenheim, Math. Annalen, 109 (1934), 
349-52, and that of Theorem 249 to E. Berg, Fysiogr. Sällsk. Lund. Förh. 5 (1935), 
1-6. 

The problem of determining all m for which k(√m) is simple is very much more 
difficult and so far unsolved. 


XV 
QUADRATIC FIELDS (2) 


15.1. The primes of k(i). We begin this chapter by determining 
the primes of k(i) and a few other simple quadratic fields. 
If π is a prime of k(√m), then 
          π | Nπ = ππ̄ 
and π | |Nπ|. There are therefore positive rational integers divisible 
by π. If z is the least such integer, z = z₁z₂, and the field is simple, 
then 
          π | z₁z₂ → π | z₁ or π | z₂, 

a contradiction unless z₁ or z₂ is 1. Hence z is a rational prime. Thus 
π divides at least one rational prime p. If it divides two, say p and p′, 
then 
          π | p . π | p′ → π | px−p′y = 1 

for appropriate x and y, a contradiction. 
THEOREM 250. Any prime π of a simple field k(√m) is a divisor of just 
one positive rational prime. 


The primes of a simple field are therefore to be determined by the 
factorization, in the field, of rational primes. 
We consider k(i) first. If 

          π = a+bi | p,   πλ = p, 

then      Nπ Nλ = p². 
Either Nλ = 1, when λ is a unity and π an associate of p, or 

(15.1.1)      Nπ = a²+b² = p. 

(i) If p = 2, then 
          p = 1²+1² = (1+i)(1−i) = i(1−i)². 

The numbers 1+i, −1+i, −1−i, 1−i (which are associates) are 
primes of k(i). 

(ii) If p = 4n+3, (15.1.1) is impossible, since a square is congruent 
to 0 or 1 (mod 4). Hence the primes 4n+3 are primes of k(i). 

(iii) If p = 4n+1, then (−1/p) = 1 (Legendre’s symbol), 
by Theorem 82, and there is an x for which 

          p | x²+1,   p | (x+i)(x−i). 

If p were a prime of k(i), it would divide x+i or x−i, and this is false, 
since the numbers 
          x/p ± i/p 
are not integers. Hence p is not a prime. It follows that p = πλ, where 
π = a+bi, λ = a−bi, and 
          Nπ = a²+b² = p. 

In this case p can be expressed as a sum of two squares. 

The prime divisors of p are 

(15.1.2)      π, iπ, −π, −iπ, λ, iλ, −λ, −iλ, 

and any of these numbers may be substituted for π. The eight 
variations correspond to the eight equations 

(15.1.3)      (±a)²+(±b)² = (±b)²+(±a)² = p. 

And if p = c²+d² then c+id | p, so that c+id is one of the numbers 
(15.1.2). Hence, apart from these variations, the expression of p as a 
sum of squares is unique. 

THEOREM 251. A rational prime p = 4n+1 can be expressed as a sum 
a²+b² of two squares. 

THEOREM 252. The primes of k(i) are 

(1) 1+i and its associates, 

(2) the rational primes 4n+3 and their associates, 

(3) the factors a+bi of the rational primes 4n+1. 
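The classification in Theorem 252 can be observed on small primes by searching directly for a representation p = a²+b². The Python sketch below is an illustration under our own naming; the search is by brute force and is not the method of the text.

```python
from math import isqrt


def is_prime(n):
    """Trial division; adequate for the small primes used here."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def two_squares(p):
    """Return (a, b) with a >= b > 0 and a^2 + b^2 = p, or None if there is none."""
    for b in range(1, isqrt(p // 2) + 1):
        a = isqrt(p - b * b)
        if a * a + b * b == p:
            return (a, b)
    return None


for p in (q for q in range(2, 60) if is_prime(q)):
    print(p, p % 4, two_squares(p))
```

For p = 4n+1 the pair (a, b) gives the prime factors a±bi of class (3); for p = 4n+3 no pair exists and p remains prime in k(i).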


15.2. Fermat’s theorem in k(i). As an illustration of the 
arithmetic of k(i), we select the analogue of Fermat’s theorem. We consider 
only the analogue of Theorem 71 and not that of the more general 
Fermat-Euler theorem. It may be worth repeating that γ | (α−β) and 

          α ≡ β (mod γ) 

mean, when we are working in the field k(ϑ), that α−β = κγ, where 
κ is an integer of the field. 

We denote rational primes 4n+1 and 4n+3 by p and q respectively, 
and a prime of k(i) by π. We confine our attention to primes of the 
classes (2) and (3), i.e. primes whose norm is odd; thus π is a q or a 
divisor of a p. We write 

          φ(π) = Nπ−1, 
so that 
          φ(π) = p−1 (π | p),   φ(π) = q²−1 (π = q). 

THEOREM 253. If (α, π) = 1, then 
          α^φ(π) ≡ 1 (mod π). 

Suppose that α = l+im. Then, when π | p, i^p = i and 

          α^p = (l+im)^p ≡ l^p+(im)^p = l^p+im^p (mod p), 

by Theorem 75; and so 
          α^p ≡ l+im = α (mod p), 

by Theorem 70. The same congruence is true mod π, and we may 
remove the factor α. 

When π = q, i^q = −i and 

          α^q = (l+im)^q ≡ l^q−im^q ≡ l−im = ᾱ (mod q). 

Similarly, ᾱ^q ≡ α, so that 
          α^(q²) ≡ α,   α^(q²−1) ≡ 1 (mod q). 

The theorem can also be proved on lines corresponding to those of 
§ 6.1. Suppose for example that π = a+bi | p. The number 

          (a+bi)(c+di) = ac−bd+i(ad+bc) 

is a multiple of π and, since (a, b) = 1, we can choose c and d so that 
ad+bc = 1. Hence there is an s such that 
          π | s+i. 
Now consider the numbers 
          r = 0, 1, 2,..., Nπ−1 = a²+b²−1, 
which are plainly incongruent (mod π). If x+yi is any integer of k(i), 
there is an r for which 
          x−sy ≡ r (mod Nπ); 
and then 
          x+yi ≡ y(s+i)+r ≡ r (mod π). 
Hence the r form a ‘complete system of residues’ (mod π). 

If α is prime to π, then, as in rational arithmetic, the numbers αr also 
form a complete system of residues.† Hence 

          Π (αr) ≡ Π r (mod π), 

and the theorem follows as in § 6.1. 

The proof in the other case is similar, but the ‘complete system’ is 
constructed differently. 
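Theorem 253 can be tested on particular primes π by computing with pairs of rational integers. The Python sketch below is our own illustration, with hypothetical names; it reduces the components of the powers modulo Nπ, which is legitimate because π | Nπ, and it uses the fact that, for a prime π, (α, π) = 1 means simply that π does not divide α.

```python
def gmul(u, v):
    """Product of the Gaussian integers u = (a, b) = a + bi and v = (c, d)."""
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)


def divides(pi, z):
    """True if pi | z in k(i): z * conj(pi) must be divisible by N(pi)."""
    a, b = pi
    n = a * a + b * b
    x, y = gmul(z, (a, -b))
    return x % n == 0 and y % n == 0


def gpow_mod(alpha, e, n):
    """alpha**e by repeated squaring, components reduced modulo the rational integer n."""
    result = (1, 0)
    base = (alpha[0] % n, alpha[1] % n)
    while e:
        if e & 1:
            x, y = gmul(result, base)
            result = (x % n, y % n)
        x, y = gmul(base, base)
        base = (x % n, y % n)
        e >>= 1
    return result


def fermat_holds(alpha, pi):
    """Check alpha^(N(pi) - 1) = 1 (mod pi)."""
    n = pi[0] ** 2 + pi[1] ** 2
    x, y = gpow_mod(alpha, n - 1, n)
    return divides(pi, (x - 1, y))


print(fermat_holds((1, 1), (2, 1)))   # pi = 2 + i divides 5
print(fermat_holds((1, 1), (3, 0)))   # pi = 3, phi(pi) = 8
```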


15.3. The primes of k(ρ). The primes of k(ρ) are also factors of 
rational primes, and there are again three cases. 
(1) If p = 3, then 
          p = (1−ρ)(1−ρ²) = (1+ρ)(1−ρ)² = −ρ²(1−ρ)². 
By Theorem 221, 1−ρ is a prime. 


† Compare Theorem 58. The proof is essentially the same. 

(2) If p ≡ 2 (mod 3) then it is impossible that Nπ = p, since 
          4Nπ = (2a−b)²+3b² 
is congruent to 0 or 1 (mod 3). Hence p is a prime in k(ρ). 
(3) If p ≡ 1 (mod 3) then 
          (−3/p) = 1 (Legendre’s symbol), 
by Theorem 96, and p | x²+3. It then follows as in § 15.1 that p is 
divisible by a prime π = a+bρ, and that 
          p = Nπ = a²−ab+b². 

THEOREM 254. A rational prime 3n+1 is expressible in the form 
a²−ab+b². 

THEOREM 255. The primes of k(ρ) are 

(1) 1−ρ and its associates, 

(2) the rational primes 3n+2 and their associates, 

(3) the factors a+bρ of the rational primes 3n+1. 

15.4. The primes of k(√2) and k(√5). The discussion goes similarly 
in other simple fields. In k(√2), for example, either p is prime or 

(15.4.1)      Nπ = a²−2b² = ±p. 

Every square is congruent to 0, 1, or 4 (mod 8), and (15.4.1) is impossible 
when p is 8n±3. When p is 8n±1, 2 is a quadratic residue of p by 
Theorem 95, and we show as before that p is factorizable. Finally 
          2 = (√2)², 
and √2 is prime. 
THEOREM 256. The primes of k(√2) are (1) √2, (2) the rational primes 
8n±3, (3) the factors a+b√2 of rational primes 8n±1 (and the associates 
of these numbers). 

We consider one more example because we require the results in 
§ 15.5. The integers of k(√5) are the numbers a+bω, where a and b 
are rational integers and 

(15.4.2)      ω = ½(1+√5). 

The norm of a+bω is a²+ab−b². 

The numbers 

(15.4.3)      ±ω^(±n)   (n = 0, 1, 2,...) 

are unities, and we can prove as in § 14.5 that there are no more. 
The determination of the primes depends upon the equation 

          Nπ = a²+ab−b² = p, 
or        (2a+b)²−5b² = 4p. 
If p = 5n±2, then (2a+b)² ≡ ±3 (mod 5), which is impossible. Hence 
these primes are primes in k(√5). 

If p = 5n±1, then 
          (5/p) = 1 (Legendre’s symbol), 
by Theorem 97. Hence p | (x²−5) for some x, and we conclude as before 
that p is factorizable. Finally 
          5 = (√5)² = (2ω−1)². 

THEOREM 257. The unities of k(√5) are the numbers (15.4.3). The 
primes are (1) √5, (2) the rational primes 5n±2, (3) the factors a+bω of 
rational primes 5n±1 (and the associates of these numbers). 


We shall also need the analogue of Fermat’s theorem. 


THEOREM 258. If $p$ and $q$ are the rational primes $5n\pm1$ and $5n\pm2$
respectively; $\phi(\pi) = |N\pi|-1$, so that

$\phi(\pi) = p-1$  $(\pi \mid p)$,    $\phi(\pi) = q^2-1$  $(\pi = q)$;

and $(\alpha, \pi) = 1$; then

(15.4.4)    $\alpha^{\phi(\pi)} \equiv 1 \pmod\pi$,
(15.4.5)    $\alpha^{p-1} \equiv 1 \pmod\pi$,
(15.4.6)    $\alpha^{q+1} \equiv N\alpha \pmod q$.

Further, if $\pi \mid p$, $\bar\pi$ is the conjugate of $\pi$, $(\alpha,\pi) = 1$ and $(\alpha,\bar\pi) = 1$, then

(15.4.7)    $\alpha^{p-1} \equiv 1 \pmod p$.

First, if    $2\alpha = c+d\sqrt5$,
then    $2\alpha^p \equiv (2\alpha)^p = (c+d\sqrt5)^p \equiv c^p+d^p5^{\frac12(p-1)}\sqrt5 \pmod p$.
But    $5^{\frac12(p-1)} \equiv \left(\frac5p\right) = 1 \pmod p$,
$c^p \equiv c$ and $d^p \equiv d$. Hence

(15.4.8)    $2\alpha^p \equiv c+d\sqrt5 = 2\alpha \pmod p$,

and a fortiori

(15.4.9)    $2\alpha^p \equiv 2\alpha \pmod\pi$.

Since $(2,\pi) = 1$ and $(\alpha,\pi) = 1$, we may divide by $2\alpha$, and obtain
(15.4.5). If also $(\alpha,\bar\pi) = 1$, so that $(\alpha,p) = 1$, then we may divide
(15.4.8) by $2\alpha$, and obtain (15.4.7).

Similarly, if $q > 2$,

(15.4.10)    $2\alpha^q \equiv c-d\sqrt5 = 2\bar\alpha$,    $\alpha^q \equiv \bar\alpha \pmod q$,
(15.4.11)    $\alpha^{q+1} \equiv \alpha\bar\alpha = N\alpha \pmod q$.

This proves (15.4.6). Also (15.4.10) involves

$\alpha^{q^2} \equiv \bar\alpha^q \equiv \alpha \pmod q$,

(15.4.12)    $\alpha^{q^2-1} \equiv 1 \pmod q$.

Finally (15.4.5) and (15.4.12) together contain (15.4.4).

The proof fails if $q = 2$, but (15.4.4) and (15.4.6) are still true. If
$\alpha = e+f\omega$ then one of $e$ and $f$ is odd, and therefore $N\alpha = e^2+ef-f^2$ is
odd. Also, to modulus 2,

$\alpha^2 \equiv e^2+f^2\omega^2 \equiv e+f\omega^2 = e+f(\omega+1) \equiv e+f(1-\omega) = e+f\bar\omega = \bar\alpha$

and    $\alpha^3 \equiv \alpha\bar\alpha = N\alpha \equiv 1$.
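
The congruences (15.4.6) and (15.4.7) are easy to test numerically. The following Python sketch is an illustration of ours and no part of the argument. It stores $a+b\omega$ as the pair `(a, b)`, uses $\omega^2 = \omega+1$, and reads a congruence to a rational modulus as a congruence of both coordinates; the names `mul`, `power` and `norm` are hypothetical helpers.

```python
def mul(x, y, mod):
    """Multiply a+b*omega by c+d*omega modulo `mod`, using omega^2 = omega + 1."""
    a, b = x
    c, d = y
    # (a + b w)(c + d w) = ac + (ad + bc) w + bd w^2 = (ac + bd) + (ad + bc + bd) w
    return ((a * c + b * d) % mod, (a * d + b * c + b * d) % mod)


def power(x, n, mod):
    """Compute x^n modulo `mod` by repeated squaring."""
    result = (1 % mod, 0)
    while n > 0:
        if n & 1:
            result = mul(result, x, mod)
        x = mul(x, x, mod)
        n >>= 1
    return result


def norm(x):
    """Norm of a+b*omega, namely a^2 + ab - b^2."""
    a, b = x
    return a * a + a * b - b * b


# (15.4.6) with q = 7 and alpha = 2 + 3*omega: alpha^8 = N(alpha) (mod 7)
alpha = (2, 3)
print(power(alpha, 8, 7), norm(alpha) % 7)
```

For (15.4.7) the condition that $\alpha$ is prime to both $\pi$ and $\bar\pi$ amounts, in this representation, to `norm(alpha) % p != 0`.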


We note in passing that our results give incidentally another proof of Theorem 
180. 


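The Fibonacci number is

$u_n = \dfrac{\omega^n-\bar\omega^n}{\omega-\bar\omega} = \dfrac{\omega^n-\bar\omega^n}{\sqrt5}$,

where $\omega$ is the number (15.4.2) and $\bar\omega = -1/\omega$ is its conjugate.
If $n = p$, then

$\omega^{p-1} \equiv 1 \pmod p$,    $\bar\omega^{p-1} \equiv 1 \pmod p$,
$u_{p-1}\sqrt5 = \omega^{p-1}-\bar\omega^{p-1} \equiv 0 \pmod p$,

and therefore $u_{p-1} \equiv 0 \pmod p$. If $n = q$, then

$\omega^{q+1} \equiv N\omega$,    $\bar\omega^{q+1} \equiv N\omega \pmod q$,
$u_{q+1}\sqrt5 \equiv 0 \pmod q$

and $u_{q+1} \equiv 0 \pmod q$.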
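
The two divisibility properties just obtained can be checked for small primes. The sketch below is ours; `fib`, `is_prime` and `fibonacci_divisibility_holds` are hypothetical names, and we assume the normalization $u_0 = 0$, $u_1 = 1$, which agrees with the formula for $u_n$ above.

```python
def fib(n):
    """Return the Fibonacci number u_n, with u_0 = 0 and u_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def fibonacci_divisibility_holds(limit):
    """Check p | u_{p-1} for primes p = 5n±1 and q | u_{q+1} for primes q = 5n±2."""
    for r in range(2, limit):
        if not is_prime(r) or r == 5:
            continue
        index = r - 1 if r % 5 in (1, 4) else r + 1
        if fib(index) % r != 0:
            return False
    return True


print(fib(10), fib(8))  # u_10 is divisible by 11, u_8 by 7
```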

15.5. Lucas’s test for the primality of the Mersenne number
$M_{4n+3}$. We are now in a position to prove a remarkable theorem which
is due, in substance at any rate, to Lucas, and which contains a necessary
and sufficient condition for the primality of $M_{4n+3}$. Many ‘necessary
and sufficient conditions’ contain no more than a transformation of a
problem, but this one gives a practical test which can be applied to
otherwise inaccessible examples.

We define the sequence

$r_1, r_2, r_3, \ldots = 3, 7, 47, \ldots$

by    $r_m = \omega^{2^m}+\bar\omega^{2^m}$,
where $\omega$ is the number (15.4.2) and $\bar\omega = -1/\omega$. Then

$r_{m+1} = r_m^2-2$.

In the notation of § 10.14, $r_m = v_{2^m}$.
No two $r_m$ have a common factor, since (i) they are all odd, and

(ii)    $r_m \equiv 0 \to r_{m+1} \equiv -2 \to r_\nu \equiv 2$    $(\nu > m+1)$,

to any odd prime modulus.

THEOREM 259. If $p$ is a prime $4n+3$, and

$M = M_p = 2^p-1$

is the corresponding Mersenne number, then $M$ is prime if

(15.5.1)    $r_{p-1} \equiv 0 \pmod M$,

and otherwise composite.

(1) Suppose $M$ prime. Since

$M \equiv 8\cdot16^n-1 \equiv 8-1 \equiv 2 \pmod 5$,

we may take $\alpha = \omega$, $q = M$ in (15.4.6). Hence

$\omega^{2^p} = \omega^{M+1} \equiv N\omega = -1 \pmod M$,
$r_{p-1} = \bar\omega^{2^{p-1}}(\omega^{2^p}+1) \equiv 0 \pmod M$,

which is (15.5.1).

(2) Suppose (15.5.1) true. Then

$\omega^{2^p}+1 = \omega^{2^{p-1}}r_{p-1} \equiv 0 \pmod M$,

(15.5.2)    $\omega^{2^p} \equiv -1 \pmod M$,
(15.5.3)    $\omega^{2^{p+1}} \equiv 1 \pmod M$.

The same congruences are true, a fortiori, to any modulus $\tau$ which
divides $M$.

Suppose that    $M = p_1p_2\ldots q_1q_2\ldots$
is the expression of $M$ as a product of rational primes, $p_i$ being a prime
$5n\pm1$ (so that $p_i$ is the product of two conjugate primes of the field)
and $q_j$ a prime $5n\pm2$. Since $M \equiv 2 \pmod 5$, there is at least one $q_j$.

The congruence    $\omega^x \equiv 1 \pmod\tau$,
or $P(x)$, is true, after (15.5.3), when $x = 2^{p+1}$, and the smallest positive
solution is, by Theorem 69, a divisor of $2^{p+1}$. These divisors, apart
from $2^{p+1}$, are $2^p, 2^{p-1},\ldots$, and $P(x)$ is false for all of them, by (15.5.2).
Hence $2^{p+1}$ is the smallest solution, and every solution is a multiple of
this one.

But    $\omega^{p_i-1} \equiv 1 \pmod{p_i}$,

$\omega^{2(q_j+1)} \equiv (N\omega)^2 = 1 \pmod{q_j}$,

by (15.4.7) and (15.4.6). Hence $p_i-1$ and $2(q_j+1)$ are multiples of
$2^{p+1}$, and

$p_i = 2^{p+1}h_i+1$,
$q_j = 2^pk_j-1$,

for some $h_i$ and $k_j$. The first hypothesis is impossible because the right-hand
side is greater than $M$; and the second is impossible unless

$k_j = 1$,    $q_j = M$.
Hence $M$ is prime.

The test in Theorem 259 applies only when $p \equiv 3 \pmod 4$. The
sequence    4, 14, 194,...
(constructed by the same rule) gives a test (verbally identical) for any $p$.
In this case the relevant field is $k(\sqrt3)$. We have selected the test in
Theorem 259 because the proof is slightly simpler.

To take a trivial example, suppose $p = 7$, $M_p = 127$. The numbers
$r_m$ of Theorem 259, reduced (mod $M$), are

$3,\ 7,\ 47,\ 2207 \equiv 48,\ 2302 \equiv 16,\ 254 \equiv 0$,

and 127 is prime. If $p = 127$, for example, we must square 125 residues,
which may contain as many as 39 digits (in the decimal scale). Such
computations were, until recently, formidable, but quite practicable,
and it was in this way that Lucas showed $M_{127}$ to be prime. The construction
of electronic digital computers has enabled the tests to be applied
to $M_p$ with larger $p$. These computers usually work in the binary scale
in which reduction to modulus $2^n-1$ is particularly simple. But their
great advantage is, of course, their speed. Thus $M_{521}$ was tested in about
a minute by SWAC and $M_{2281}$ in about an hour. Each minute of this
machine’s time is equivalent to more than a year’s work for someone
using a desk calculator.
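
The test of Theorem 259 is short enough to state as a program. The Python sketch below is ours and is meant only to illustrate the recurrence $r_{m+1} = r_m^2-2$; the function names are hypothetical, and the caller is assumed to supply a prime $p = 4n+3$ (the sketch checks the residue of $p$ to modulus 4 but not its primality).

```python
def lucas_residues(p):
    """Return r_1, ..., r_{p-1}, each reduced modulo M = 2^p - 1."""
    M = (1 << p) - 1
    residues = [3 % M]                      # r_1 = 3
    for _ in range(p - 2):                  # r_{m+1} = r_m^2 - 2
        residues.append((residues[-1] ** 2 - 2) % M)
    return residues


def mersenne_is_prime(p):
    """Theorem 259: for a prime p = 4n+3, M_p is prime exactly when M_p | r_{p-1}."""
    if p % 4 != 3:
        raise ValueError("Theorem 259 is stated only for primes p = 4n+3")
    return lucas_residues(p)[-1] == 0


print(lucas_residues(7))       # [3, 7, 47, 48, 16, 0], as in the example above
print(mersenne_is_prime(127))  # True
```

Since every residue is reduced before the next squaring, no intermediate number exceeds $M^2$, which is what makes the test practicable for large $p$.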


15.6. General remarks on the arithmetic of quadratic fields. 
The construction of an arithmetic in a field which is not simple, like 
$k\{\sqrt{(-5)}\}$ or $k(\sqrt{10})$, demands new ideas which (though they are not
particularly difficult) we cannot develop systematically here. We add 
only some miscellaneous remarks which may be useful to a reader who 
wishes to study the subject more seriously. 

We state below three properties, A, B, and C, common to the ‘simple’ 
fields which we have examined. These properties are all consequences 
of the Euclidean algorithm, when such an algorithm exists, and it was 
thus that we proved them in these fields. They are, however, true in 
any simple field, whether the field is Euclidean or not. We shall not 
prove so much as this; but a little consideration of the logical relations
between them will be instructive. 

A. If $\alpha$ and $\beta$ are integers of the field, then there is an integer $\delta$ with
the properties

(A i)    $\delta \mid \alpha$,  $\delta \mid \beta$,

and

(A ii)    $\delta_1 \mid \alpha \,.\, \delta_1 \mid \beta \;\to\; \delta_1 \mid \delta$.

Thus $\delta$ is the highest, or ‘most comprehensive’, common divisor $(\alpha, \beta)$
of $\alpha$ and $\beta$, as we defined it, in $k(i)$, in § 12.8.

B. If $\alpha$ and $\beta$ are integers of the field, then there is an integer $\delta$ with
the properties

(B i)    $\delta \mid \alpha$,  $\delta \mid \beta$,

and (B ii) $\delta$ is a linear combination of $\alpha$ and $\beta$; there are integers $\lambda$ and $\mu$
such that    $\lambda\alpha+\mu\beta = \delta$.


It is obvious that B implies A; (B i) is the same as (A i), and a $\delta$ with
the properties (B i) and (B ii) has the properties (A i) and (A ii). The
converse, though true in the quadratic fields in which we are interested 
now, is less obvious, and depends upon the special properties of these 
fields. 

There are ‘fields’ in which ‘integers’ possess a highest common divisor in sense
A but not in sense B. Thus the aggregate of all rational functions

$R(x,y) = \dfrac{P(x,y)}{Q(x,y)}$

of two independent variables, with rational coefficients, is a field in the sense
explained at the end of § 14.1. We may call the polynomials $P(x,y)$ of the field
the ‘integers’, regarding two polynomials as the same when they differ only by
a constant factor. Two polynomials have a greatest common divisor in sense A;
thus $x$ and $y$ have the greatest common divisor 1. But there are no polynomials
$P(x,y)$ and $Q(x,y)$ such that

$xP(x,y)+yQ(x,y) = 1$.
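C. Factorization in the field is unique: the field is simple.

It is plain that B implies C; for (B i) and (B ii) imply

$\delta\gamma \mid \alpha\gamma$,    $\delta\gamma \mid \beta\gamma$,    $\lambda\alpha\gamma+\mu\beta\gamma = \delta\gamma$,

and so

(15.6.1)    $(\alpha\gamma, \beta\gamma) = \delta\gamma$;

and from this C follows as in § 12.8.

That A implies C is not quite so obvious, but may be proved as
follows. It is enough to deduce (15.6.1) from A. Let

$(\alpha\gamma, \beta\gamma) = \Delta$.

Then    $\delta \mid \alpha \,.\, \delta \mid \beta \;\to\; \delta\gamma \mid \alpha\gamma \,.\, \delta\gamma \mid \beta\gamma$,
and so, by (A ii),    $\delta\gamma \mid \Delta$.
Hence    $\Delta = \delta\gamma\rho$,
say. But $\Delta \mid \alpha\gamma$, $\Delta \mid \beta\gamma$ and so

$\delta\rho \mid \alpha$,    $\delta\rho \mid \beta$;

and hence, again by (A ii),    $\delta\rho \mid \delta$.
Hence $\rho$ is a unity, and $\Delta = \delta\gamma$.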

On the other hand, it is obvious that C implies A; for $\delta$ is the product
of all prime factors common to $\alpha$ and $\beta$. That C implies B is again less
immediate, and depends, like the inference from A to B, on the special
properties of the fields in question.†


15.7. Ideals in a quadratic field. There is another property 
common to all simple quadratic fields. To fix our ideas, we consider 
the field $k(i)$, whose basis (§ 14.3) is $[1, i]$.

A lattice $\Lambda$ is‡ the aggregate of all points‖

$m\alpha+n\beta$,

$\alpha$ and $\beta$ being the points $P$ and $Q$ of § 3.5, and $m$ and $n$ running through
the rational integers. We say that $[\alpha,\beta]$ is a basis of $\Lambda$, and write

$\Lambda = [\alpha,\beta]$;

a lattice will, of course, have many different bases. The lattice is a
modulus in the sense of § 2.9, and has the property

(15.7.1)    $\rho \in \Lambda \,.\, \sigma \in \Lambda \;\to\; m\rho+n\sigma \in \Lambda$

for any rational integral $m$ and $n$.

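Among lattices there is a sub-class of peculiar importance. Suppose
that $\Lambda$ has, in addition to (15.7.1), the property

(15.7.2)    $\gamma \in \Lambda \;\to\; i\gamma \in \Lambda$.

Then plainly $m\gamma \in \Lambda$ and $ni\gamma \in \Lambda$, and so

$\gamma \in \Lambda \;\to\; \mu\gamma \in \Lambda$

for every integer $\mu$ of $k(i)$; all multiples of points of $\Lambda$ by integers of $k(i)$
are also points of $\Lambda$. Such a lattice is called an ideal. If $\Lambda$ is an ideal,
and $\rho$ and $\sigma$ belong to $\Lambda$, then $\mu\rho+\nu\sigma$ belongs to $\Lambda$:

(15.7.3)    $\rho \in \Lambda \,.\, \sigma \in \Lambda \;\to\; \mu\rho+\nu\sigma \in \Lambda$

for all integral $\mu$ and $\nu$. This property includes, but states much more
than, (15.7.1).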

Suppose now that $\Lambda$ is an ideal with basis $[\alpha,\beta]$, and that

$(\alpha,\beta) = \delta$.

Then every point of $\Lambda$ is a multiple of $\delta$. Also, since $\delta$ is a linear
combination of $\alpha$ and $\beta$, $\delta$ and all its multiples are points of $\Lambda$. Thus $\Lambda$ is
the class of all multiples of $\delta$; and it is plain that, conversely, the class
of multiples of any $\delta$ is an ideal $\Lambda$. Any ideal is the class of multiples of
an integer of the field, and any such class is an ideal.

† In fact both inferences depend on just those arguments which are required in the
elements of the theory of ideals in a quadratic field.

‡ See § 3.5. There, however, we reserved the symbol $\Lambda$ for the principal lattice.


‖ We do not distinguish between a point and the number which is its affix in the
Argand diagram.

If $\Lambda$ is the class of multiples of $\rho$, we write

$\Lambda = \{\rho\}$.

In particular the fundamental lattice, formed by all the integers of the
field, is $\{1\}$.

The properties of an integer $\rho$ may be restated as properties of the
ideal $\{\rho\}$. Thus $\sigma \mid \rho$ means that $\{\rho\}$ is a part of $\{\sigma\}$. We can then say
that ‘$\{\rho\}$ is divisible by $\{\sigma\}$’, and write

$\{\sigma\} \mid \{\rho\}$.

Or again we can write

$\{\sigma\} \mid \rho$,    $\rho \equiv 0 \pmod{\{\sigma\}}$,

these assertions meaning that the number $\rho$ belongs to the ideal $\{\sigma\}$.
In this way we can restate the whole of the arithmetic of the field in
terms of ideals, though, in $k(i)$, we gain nothing substantial by such a
restatement. An ideal being always the class of multiples of an integer,
the new arithmetic is merely a verbal translation of the old one.

We can, however, define ideals in any quadratic field. We wish to
use the geometrical imagery of the complex plane, and we shall therefore
consider only complex fields.

Suppose that $k(\sqrt m)$ is a complex field with basis $[1,\omega]$.† We may
define a lattice as we defined it above in $k(i)$, and an ideal as a lattice
which has the property

(15.7.4)    $\gamma \in \Lambda \;\to\; \omega\gamma \in \Lambda$

analogous to (15.7.2). As in $k(i)$, such a lattice has also the property
(15.7.3), and this property might be used as an alternative definition
of an ideal.

Since two numbers $\alpha$ and $\beta$ have not necessarily a ‘greatest common
divisor’ we can no longer prove that an ideal $\mathbf r$ has necessarily the form
$\{\rho\}$; any $\{\rho\}$ is an ideal, but the converse is not generally true. But the
definitions above, which were logically independent of this reduction,
are still available; we can define

$\mathbf s \mid \mathbf r$

as meaning that every number of $\mathbf r$ belongs to $\mathbf s$, and

$\rho \equiv 0 \pmod{\mathbf s}$

as meaning that $\rho$ belongs to $\mathbf s$. We can thus define words like divisible,
divisor, and prime with reference to ideals, and have the foundations
for an arithmetic which is at any rate as extensive as the ordinary arithmetic
of simple fields, and may perhaps be useful where such ordinary
arithmetic fails. That this hope is justified, and that the notion of an
ideal leads to a complete re-establishment of arithmetic in any field, is
shown in systematic treatises on the theory of algebraic numbers. The
reconstruction is as effective in real as in complex fields, though not all
of our geometrical language is then appropriate.

† $\omega = \sqrt m$ when $m \not\equiv 1 \pmod 4$.


An ideal of the special type {p} is called a principal ideal; and the 
fourth characteristic property of simple quadratic fields, to which we 
referred at the beginning of this section, is 

D. Every ideal of a simple field is a principal ideal. 

This property may also be stated, when the field is complex, in a
simple geometrical form. In $k(i)$ an ideal, that is to say a lattice with
the property (15.7.2), is square; for it is of the form $\{\rho\}$, and may be
regarded as the figure of lines based on the origin and the points $\rho$
and $i\rho$. More generally

E. If $m < 0$ and $k(\sqrt m)$ is simple, then every ideal of $k(\sqrt m)$ is a lattice
similar in shape to the lattice formed by all the integers of the field.

It is instructive to verify that this is not true in $k\{\sqrt{(-5)}\}$. The lattice

$m\alpha+n\beta = m\cdot3+n\{-1+\sqrt{(-5)}\}$

is an ideal, for $\omega = \sqrt{(-5)}$ and

$\omega\alpha = \alpha+3\beta$,    $\omega\beta = -2\alpha-\beta$.

But, as is shown by Fig. 8 (and may, of course, be verified analytically),
the lattice is not similar to the lattice of all integers of the field.
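
The two identities for $\omega\alpha$ and $\omega\beta$ can be verified mechanically. In the Python sketch below (ours, with hypothetical names) the number $x+y\sqrt{(-5)}$ is stored as the pair `(x, y)`. The last function rests on a remark which is our own assumption and is not argued in the text: a lattice similar to that of all integers would be the class of multiples of some $\rho$ belonging to it, and, since the lattice contains one in three of the integers of the field, such a $\rho$ would have norm $x^2+5y^2 = 3$.

```python
ALPHA = (3, 0)      # alpha = 3
BETA = (-1, 1)      # beta = -1 + sqrt(-5)


def combine(m, n):
    """Return m*alpha + n*beta as a pair (x, y) meaning x + y*sqrt(-5)."""
    return (m * ALPHA[0] + n * BETA[0], m * ALPHA[1] + n * BETA[1])


def times_omega(z):
    """Multiply x + y*sqrt(-5) by omega = sqrt(-5)."""
    x, y = z
    return (-5 * y, x)


def in_lattice(z):
    """z = m*3 + n*(-1 + sqrt(-5)) forces n = y and 3m = x + y."""
    x, y = z
    return (x + y) % 3 == 0


def has_element_of_norm_3(bound=3):
    """Search for x + y*sqrt(-5) with x^2 + 5y^2 = 3; larger |x|, |y| give larger norms."""
    return any(x * x + 5 * y * y == 3
               for x in range(-bound, bound + 1)
               for y in range(-bound, bound + 1))


print(times_omega(ALPHA) == combine(1, 3))    # omega*alpha = alpha + 3*beta
print(times_omega(BETA) == combine(-2, -1))   # omega*beta = -2*alpha - beta
print(has_element_of_norm_3())                # False
```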


15.8. Other fields. We conclude this chapter with a few remarks 
about some non-quadratic fields of particularly interesting types. We 
leave the verification of most of our assertions to the reader. 

(i) The field $k(\sqrt2+i)$. The number

$\vartheta = \sqrt2+i$

satisfies    $\vartheta^4-2\vartheta^2+9 = 0$,

and the number defines a field which we denote by $k(\sqrt2+i)$. The
numbers of the field are

(15.8.1)    $\xi = r+si+t\sqrt2+ui\sqrt2$,

where $r, s, t, u$ are rational. The integers of the field are

(15.8.2)    $\xi = a+bi+c\sqrt2+di\sqrt2$,

where $a$ and $b$ are integers and $c$ and $d$ are either both integers or both
halves of odd integers.

The conjugates of $\xi$ are the numbers $\xi_1, \xi_2, \xi_3$ formed by changing the
sign of either or both of $i$ and $\sqrt2$ in (15.8.1) or (15.8.2), and the norm
$N\xi$ of $\xi$ is defined by

$N\xi = \xi\xi_1\xi_2\xi_3$.

Divisibility, and so forth, are defined as in the fields already considered.
There is a Euclidean algorithm, and factorization is unique.†

† Theorem 215 stands in the field as stated in § 12.8. The proof demands some
calculation.

(ii) The field $k(\sqrt2+\sqrt3)$. The number

$\vartheta = \sqrt2+\sqrt3$

satisfies the equation    $\vartheta^4-10\vartheta^2+1 = 0$.
The numbers of the field are

$\xi = r+s\sqrt2+t\sqrt3+u\sqrt6$,

and the integers are the numbers

$\xi = a+b\sqrt2+c\sqrt3+d\sqrt6$,

where $a$ and $c$ are integers and $b$ and $d$ are either both integers or both
halves of odd integers. There is again a Euclidean algorithm, and
factorization is unique.

These fields are simple examples of ‘biquadratic’ fields.
(iii) The field $k(e^{\frac25\pi i})$. The number $\vartheta = e^{\frac25\pi i}$ satisfies the equation

$\dfrac{\vartheta^5-1}{\vartheta-1} = \vartheta^4+\vartheta^3+\vartheta^2+\vartheta+1 = 0$.

The field is, after $k(i)$ and $k(\rho)$, the simplest ‘cyclotomic’ field.†
The numbers of the field are

$\xi = r+s\vartheta+t\vartheta^2+u\vartheta^3$,

and the integers are the numbers in which $r, s, t, u$ are integral. The
conjugates of $\xi$ are the numbers $\xi_1, \xi_2, \xi_3$ obtained by changing $\vartheta$ into
$\vartheta^2, \vartheta^3, \vartheta^4$, and its norm is

$N\xi = \xi\xi_1\xi_2\xi_3$.

There is a Euclidean algorithm, and factorization is unique.

The number of unities in $k(i)$ and $k(\rho)$ is finite. In $k(e^{\frac25\pi i})$ the number
is infinite. Thus    $(1+\vartheta) \mid (\vartheta+\vartheta^2+\vartheta^3+\vartheta^4)$
and $\vartheta+\vartheta^2+\vartheta^3+\vartheta^4 = -1$, so that $1+\vartheta$ and all its powers are unities.

It is plainly this field which we must consider if we wish to prove
‘Fermat’s last theorem’, when $n = 5$, by the method of § 13.4. The
proof follows the same lines, but there are various complications of
detail.

The field defined by a primitive $n$th root of unity is simple, in the
sense of § 14.7, when‡

$n = 3, 4, 5, 8$.


NOTES ON CHAPTER XV 


§ 15.5. Lucas stated two tests for the primality of $M_p$, but his statements of
his theorems vary, and he never published any complete proof of either. The
argument in the text is due to Western, Journal London Math. Soc. 7 (1932),
130-7. The second theorem, not proved in the text, is that referred to in the
penultimate paragraph of the section. Western proves this theorem by using the
field $k(\sqrt3)$. Other proofs, independent of the theory of algebraic numbers, have
been given by D. H. Lehmer, Annals of Math. (2) 31 (1930), 419-48, and Journal
London Math. Soc. 10 (1935), 162-5.

† The field $k(\vartheta)$, with $\vartheta$ a primitive $n$th root of unity, is called cyclotomic because $\vartheta$
and its powers are the complex coordinates of the vertices of a regular $n$-agon inscribed
in the unit circle.

‡ $e^{\frac14\pi i} = (1+i)/\sqrt2$ is a number of $k(\sqrt2+i)$.

Professor Newman has drawn our attention to the following result, which can
be proved by a simple extension of the argument of this section.

Let $h < 2^m$ be odd, $M = 2^mh-1 \equiv \pm2 \pmod 5$ and

$R_1 = \omega^{2h}+\bar\omega^{2h}$,    $R_j = R_{j-1}^2-2$  $(j \ge 2)$.

Then a necessary and sufficient condition for $M$ to be prime is that

$R_{m-1} \equiv 0 \pmod M$.

This result was stated by Lucas [Amer. Journal of Math. 1 (1878), 310], who
gives a similar (but apparently erroneous) test for numbers of the form
$N = h2^m+1$. The primality of the latter can, however, be determined by the
test of Theorem 102, which also requires about $m$ squarings and reductions
(mod $N$). The two tests would provide a practicable means of seeking large
prime pairs $(p, p+2)$.

§§ 15.6-7. These sections have been much improved as a result of criticisms
from Mr. Ingham, who read an earlier version. The remark about polynomials
in § 15.6 is due to Bochner, Journal London Math. Soc. 9 (1934), 4.

§ 15.8. There is a proof that $k(e^{\frac25\pi i})$ is Euclidean in Landau, Vorlesungen, iii.
228-31.


XVI 
THE ARITHMETICAL FUNCTIONS $\phi(n)$, $\mu(n)$, $d(n)$, $\sigma(n)$, $r(n)$


16.1. The function $\phi(n)$. In this and the next two chapters we
shall study the properties of certain ‘arithmetical functions’ of $n$, that
is to say functions $f(n)$ of the positive integer $n$ defined in a manner
which expresses some arithmetical property of $n$.

The function $\phi(n)$ was defined in § 5.5, for $n > 1$, as the number of
positive integers less than and prime to $n$. We proved (Theorem 62)
that

(16.1.1)    $\phi(n) = n\prod_{p\mid n}\left(1-\dfrac1p\right)$.


This formula is also an immediate consequence of the general principle 
expressed by the theorem which follows. 


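THEOREM 260. If there are $N$ objects, of which $N_\alpha$ have the property $\alpha$,
$N_\beta$ have $\beta$,..., $N_{\alpha\beta}$ have both $\alpha$ and $\beta$,..., $N_{\alpha\beta\gamma}$ have $\alpha$, $\beta$, and $\gamma$,..., and so on,
then the number of the objects which have none of $\alpha, \beta, \gamma,\ldots$ is

(16.1.2)    $N-N_\alpha-N_\beta-\ldots+N_{\alpha\beta}+\ldots-N_{\alpha\beta\gamma}-\ldots$.

Suppose that $O$ is an object which has just $k$ of the properties $\alpha, \beta,\ldots$.
Then $O$ contributes 1 to $N$. If $k \ge 1$, $O$ also contributes 1 to $k$ of
$N_\alpha, N_\beta,\ldots$, to $\frac12k(k-1)$ of $N_{\alpha\beta},\ldots$, to

$\dfrac{k(k-1)(k-2)}{1\cdot2\cdot3}$

of $N_{\alpha\beta\gamma},\ldots$, and so on. Hence, if $k \ge 1$, it contributes

$1-k+\dfrac{k(k-1)}{1\cdot2}-\dfrac{k(k-1)(k-2)}{1\cdot2\cdot3}+\ldots = (1-1)^k = 0$

to the sum (16.1.2). On the other hand, if $k = 0$, it contributes 1.
Hence (16.1.2) is the number of objects possessing none of the properties.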

The number of integers not greater than $n$ and divisible by $a$ is

$\left[\dfrac na\right]$.

If $a$ is prime to $b$, then the number of integers not greater than $n$, and
divisible by both $a$ and $b$, is

$\left[\dfrac n{ab}\right]$,

and so on. Hence, taking $\alpha, \beta, \gamma,\ldots$ to be divisibility by $a, b, c,\ldots$, we
obtain

THEOREM 261. The number of integers, less than or equal to $n$, and not
divisible by any one of a coprime set of integers $a, b,\ldots$, is

$[n]-\sum\left[\dfrac na\right]+\sum\left[\dfrac n{ab}\right]-\ldots$.

If we take $a, b,\ldots$ to be the different prime factors $p, p',\ldots$ of $n$, we
obtain

(16.1.3)    $\phi(n) = n-\sum\dfrac np+\sum\dfrac n{pp'}-\ldots = n\prod_{p\mid n}\left(1-\dfrac1p\right)$,

which is Theorem 62.
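
Theorem 261 and the deduction of (16.1.3) translate directly into a computation. The Python sketch below is ours; the function names are hypothetical, and the moduli passed to `count_not_divisible` are assumed to be coprime in pairs, as the theorem requires.

```python
from itertools import combinations
from math import gcd, prod


def prime_factors(n):
    """Distinct prime factors of n, found by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def count_not_divisible(n, moduli):
    """Theorem 261: [n] - sum [n/a] + sum [n/ab] - ... for coprime moduli a, b, ..."""
    total = 0
    for size in range(len(moduli) + 1):
        sign = (-1) ** size
        for subset in combinations(moduli, size):
            total += sign * (n // prod(subset))   # prod(()) == 1 gives the term [n]
    return total


def phi(n):
    """(16.1.3): take the moduli to be the different prime factors of n."""
    return count_not_divisible(n, prime_factors(n))


def phi_by_counting(n):
    """Direct count of the integers 1 <= k <= n prime to n, for comparison."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print(phi(12), phi_by_counting(12))  # 4 4
```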


16.2. A further proof of Theorem 63. Consider the set of $n$
rational fractions

(16.2.1)    $\dfrac hn$    $(1 \le h \le n)$.

We can express each of these fractions in ‘irreducible’ form in just one
way, that is,

$\dfrac hn = \dfrac ad$,

where $d \mid n$ and

(16.2.2)    $1 \le a \le d$,    $(a,d) = 1$,

and $a$ and $d$ are uniquely determined by $h$ and $n$. Conversely, every
fraction $a/d$, for which $d \mid n$ and (16.2.2) is satisfied, appears in the set
(16.2.1), though in general not in reduced form. Hence, for any function
$F(x)$, we have

(16.2.3)    $\sum_{1\le h\le n}F\left(\dfrac hn\right) = \sum_{d\mid n}\ \sum_{1\le a\le d,\ (a,d)=1}F\left(\dfrac ad\right)$.

Again, for a particular $d$, there are (by definition) just $\phi(d)$ values of
$a$ satisfying (16.2.2). Hence, if we put $F(x) = 1$ in (16.2.3), we have

$n = \sum_{d\mid n}\phi(d)$,

which is Theorem 63.


16.3. The Möbius function. The Möbius function $\mu(n)$ is defined
as follows:

(i) $\mu(1) = 1$;
(ii) $\mu(n) = 0$ if $n$ has a squared factor;
(iii) $\mu(p_1p_2\ldots p_k) = (-1)^k$ if all the primes $p_1, p_2,\ldots, p_k$ are different.
Thus $\mu(2) = -1$, $\mu(4) = 0$, $\mu(6) = 1$.

THEOREM 262. $\mu(n)$ is multiplicative.†

This follows immediately from the definition of $\mu(n)$.
From (16.1.3) and the definition of $\mu(n)$ we obtain

(16.3.1)    $\phi(n) = n\sum_{d\mid n}\dfrac{\mu(d)}d = \sum_{d\mid n}\dfrac nd\,\mu(d) = \sum_{d\mid n}d\,\mu\left(\dfrac nd\right) = \sum_{dd'=n}d'\mu(d)$.‡
Next, we prove

THEOREM 263:

$\sum_{d\mid n}\mu(d) = 1$  $(n = 1)$,    $\sum_{d\mid n}\mu(d) = 0$  $(n > 1)$.

THEOREM 264. If $n > 1$, and $k$ is the number of different prime factors
of $n$, then

$\sum_{d\mid n}|\mu(d)| = 2^k$.

In fact, if $k \ge 1$ and $n = p_1^{a_1}\ldots p_k^{a_k}$, we have

$\sum_{d\mid n}\mu(d) = 1+\sum_i\mu(p_i)+\sum_{i<j}\mu(p_ip_j)+\ldots$
$= 1-k+\binom k2-\binom k3+\ldots = (1-1)^k = 0$,

while, if $n = 1$, $\mu(n) = 1$. This proves Theorem 263. The proof of
Theorem 264 is similar. There is an alternative proof of Theorem 263
depending on an important general theorem.


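THEOREM 265. If $f(n)$ is a multiplicative function of $n$, then so is

$g(n) = \sum_{d\mid n}f(d)$.

If $(n,n') = 1$, $d \mid n$, and $d' \mid n'$, then $(d,d') = 1$ and $c = dd'$ runs
through all divisors of $nn'$. Hence

$g(nn') = \sum_{c\mid nn'}f(c) = \sum_{d\mid n,\ d'\mid n'}f(dd')$
$= \sum_{d\mid n}f(d)\sum_{d'\mid n'}f(d') = g(n)g(n')$.

To deduce Theorem 263 we write $f(n) = \mu(n)$, so that

$g(n) = \sum_{d\mid n}\mu(d)$.

Then $g(1) = 1$, and    $g(p^m) = 1+\mu(p) = 0$
when $m \ge 1$. Hence, when $n = p_1^{a_1}\ldots p_k^{a_k} > 1$,

$g(n) = g(p_1^{a_1})g(p_2^{a_2})\ldots = 0$.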
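
The definition of $\mu(n)$ and Theorems 263 and 264 lend themselves to a direct numerical check. The sketch below is ours; `mobius`, `divisors` and `distinct_prime_count` are hypothetical helper names, and trial division is used because only small $n$ are intended.

```python
def mobius(n):
    """mu(n): 0 if n has a squared factor, otherwise (-1)^k for k different primes."""
    if n < 1:
        raise ValueError("mu(n) is defined for positive n only")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def distinct_prime_count(n):
    """The number k of different prime factors of n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)


print([mobius(n) for n in (1, 2, 4, 6)])         # [1, -1, 0, 1]
print(sum(mobius(d) for d in divisors(60)))      # 0, by Theorem 263
print(sum(abs(mobius(d)) for d in divisors(60))) # 8 = 2^3, by Theorem 264
```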


† See § 5.5.
‡ A sum extended over all pairs $d, d'$ for which $dd' = n$.

16.4. The Möbius inversion formula. In what follows we shall
make frequent use of a general ‘inversion’ formula first proved by
Möbius.

THEOREM 266. If    $g(n) = \sum_{d\mid n}f(d)$,

then    $f(n) = \sum_{d\mid n}\mu\left(\dfrac nd\right)g(d) = \sum_{d\mid n}\mu(d)\,g\left(\dfrac nd\right)$.

In fact

$\sum_{d\mid n}\mu(d)\,g\left(\dfrac nd\right) = \sum_{d\mid n}\mu(d)\sum_{c\mid\frac nd}f(c) = \sum_{cd\mid n}\mu(d)f(c) = \sum_{c\mid n}f(c)\sum_{d\mid\frac nc}\mu(d)$.

The inner sum here is 1 if $n/c = 1$, i.e. if $c = n$, and 0 otherwise, by
Theorem 263, so that the repeated sum reduces to $f(n)$.
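Theorem 266 has a converse expressed by

THEOREM 267:

$f(n) = \sum_{d\mid n}\mu\left(\dfrac nd\right)g(d) \;\to\; g(n) = \sum_{d\mid n}f(d)$.

The proof is similar to that of Theorem 266. We have

$\sum_{d\mid n}f(d) = \sum_{d\mid n}f\left(\dfrac nd\right) = \sum_{d\mid n}\ \sum_{c\mid\frac nd}\mu\left(\dfrac n{cd}\right)g(c)$
$= \sum_{cd\mid n}\mu\left(\dfrac n{cd}\right)g(c) = \sum_{c\mid n}g(c)\sum_{d\mid\frac nc}\mu\left(\dfrac n{cd}\right) = g(n)$.

If we put $g(n) = n$ in Theorem 267, and use (16.3.1), so that
$f(n) = \phi(n)$, we obtain Theorem 63.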
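
The inversion of Theorems 266 and 267 may be tried on small cases. In the sketch below (ours, with hypothetical names) `divisor_sum` forms $g$ from $f$ and `mobius_inverse` recovers $f$ from $g$; with $g(n) = n$ the recovered function should be $\phi(n)$, which is here counted directly.

```python
from math import gcd


def mobius(n):
    """mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def divisor_sum(f, n):
    """g(n) = sum of f(d) over the divisors d of n."""
    return sum(f(d) for d in divisors(n))


def mobius_inverse(g, n):
    """Theorem 266: f(n) = sum of mu(n/d) g(d) over the divisors d of n."""
    return sum(mobius(n // d) * g(d) for d in divisors(n))


def phi(n):
    """Direct count of the integers 1 <= k <= n prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


# g(n) = n inverts to phi(n)
print([mobius_inverse(lambda m: m, n) for n in range(1, 11)])
print([phi(n) for n in range(1, 11)])
```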


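As an example of the use of Theorem 266, we give another proof of
Theorem 110.

We suppose that $d \mid p-1$ and $c \mid d$, and that $\chi(c)$ is the number of
roots of the congruence $x^d \equiv 1 \pmod p$ which belong to $c$. Then (since
the congruence has $d$ roots in all)

$\sum_{c\mid d}\chi(c) = d$;

from which, by Theorem 266 and (16.3.1), it follows that

$\chi(c) = \sum_{d'\mid c}\mu(d')\dfrac c{d'} = \phi(c)$.

16.5. Further inversion formulae. There are other inversion
formulae involving $\mu(n)$, of a rather different type.

THEOREM 268. If    $G(x) = \sum_{n=1}^{[x]}F\left(\dfrac xn\right)$

for all positive $x$,† then

$F(x) = \sum_{n=1}^{[x]}\mu(n)\,G\left(\dfrac xn\right)$.

For

$\sum_{n=1}^{[x]}\mu(n)\,G\left(\dfrac xn\right) = \sum_{n=1}^{[x]}\mu(n)\sum_{m=1}^{[x/n]}F\left(\dfrac x{mn}\right) = \sum_{1\le k\le[x]}F\left(\dfrac xk\right)\sum_{n\mid k}\mu(n) = F(x)$,‡

by Theorem 263. There is a converse, viz.

THEOREM 269:

$F(x) = \sum_{n=1}^{[x]}\mu(n)\,G\left(\dfrac xn\right) \;\to\; G(x) = \sum_{n=1}^{[x]}F\left(\dfrac xn\right)$.

This may be proved similarly.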
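
Theorem 268 can be tested with exact rational arithmetic. The sketch below is ours and the names are hypothetical. We try only values $x \ge 1$, for which the sums are not empty, and the choice $F(x) = x^2+1$ is arbitrary. The special case $F(x) = 1$, for which $G(x) = [x]$, gives the familiar identity $\sum_{n\le x}\mu(n)[x/n] = 1$.

```python
from fractions import Fraction
from math import floor


def mobius(n):
    """mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def G_from_F(F, x):
    """G(x) = sum of F(x/n) for 1 <= n <= [x]; an empty sum is 0."""
    return sum(F(x / n) for n in range(1, floor(x) + 1))


def F_from_G(G, x):
    """Theorem 268: F(x) = sum of mu(n) G(x/n) for 1 <= n <= [x]."""
    return sum(mobius(n) * G(x / n) for n in range(1, floor(x) + 1))


F = lambda x: x * x + 1
G = lambda x: G_from_F(F, x)
x = Fraction(37, 4)
print(F_from_G(G, x) == F(x))        # True
print(F_from_G(floor, x))            # 1, the case F(x) = 1
```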
Two further inversion formulae are contained in

THEOREM 270:

$g(x) = \sum_{n=1}^{\infty}f(nx) \;\equiv\; f(x) = \sum_{n=1}^{\infty}\mu(n)\,g(nx)$.

The reader should have no difficulty in constructing a proof with
the help of Theorem 263; but some care is required about convergence.
A sufficient condition is that

$\sum_{m,n}|f(mnx)| = \sum_k d(k)\,|f(kx)|$

should be convergent. Here $d(k)$ is the number of divisors of $k$.§


16.6. Evaluation of Ramanujan’s sum. Ramanujan’s sum $c_n(m)$
was defined in § 5.6 by

(16.6.1)    $c_n(m) = \sum_{1\le h\le n,\ (h,n)=1}e\left(\dfrac{hm}n\right)$.

We can now express $c_n(m)$ as a sum extended over the common divisors
of $m$ and $n$.

THEOREM 271:    $c_n(m) = \sum_{d\mid m,\ d\mid n}\mu\left(\dfrac nd\right)d$.


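† An empty sum is as usual to be interpreted as 0. Thus $G(x) = 0$ if $0 < x < 1$.
‡ If $mn = k$ then $n \mid k$, and $k$ runs through the numbers $1, 2,\ldots, [x]$.
§ See § 16.7.

If we write

$g(n) = \sum_{1\le h\le n}F\left(\dfrac hn\right)$,    $f(n) = \sum_{1\le h\le n,\ (h,n)=1}F\left(\dfrac hn\right)$,

(16.2.3) becomes    $g(n) = \sum_{d\mid n}f(d)$.

By Theorem 266, we have the inverse formula

(16.6.2)    $f(n) = \sum_{d\mid n}\mu\left(\dfrac nd\right)g(d)$,

that is

(16.6.3)    $\sum_{1\le h\le n,\ (h,n)=1}F\left(\dfrac hn\right) = \sum_{d\mid n}\mu\left(\dfrac nd\right)\sum_{1\le a\le d}F\left(\dfrac ad\right)$.

We now take $F(x) = e(mx)$. In this event,

$f(n) = c_n(m)$

by (16.6.1), while    $g(n) = \sum_{1\le h\le n}e\left(\dfrac{hm}n\right)$,

which is $n$ or 0 according as $n \mid m$ or $n \nmid m$. Hence (16.6.2) becomes

$c_n(m) = \sum_{d\mid n,\ d\mid m}\mu\left(\dfrac nd\right)d$.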

Another simple expression for $c_n(m)$ is given by

THEOREM 272. If $(n,m) = a$ and $n = aN$, then

$c_n(m) = \dfrac{\mu(N)\phi(n)}{\phi(N)}$.

By Theorem 271,

$c_n(m) = \sum_{d\mid a}d\,\mu\left(\dfrac nd\right) = \sum_{cd=a}d\,\mu(Nc) = \sum_{c\mid a}\dfrac ac\,\mu(Nc)$.

Now $\mu(Nc) = \mu(N)\mu(c)$ or 0 according as $(N,c) = 1$ or not. Hence

$c_n(m) = a\mu(N)\sum_{c\mid a,\ (c,N)=1}\dfrac{\mu(c)}c = a\mu(N)\left(1-\sum\dfrac1p+\sum\dfrac1{pp'}-\ldots\right)$,

where these sums run over those different $p$ which divide $a$ but do not
divide $N$. Hence

$c_n(m) = a\mu(N)\prod_{p\mid a,\ p\nmid N}\left(1-\dfrac1p\right)$.

But, by Theorem 62,

$\dfrac{\phi(n)}{\phi(N)} = \dfrac nN\prod_{p\mid n,\ p\nmid N}\left(1-\dfrac1p\right) = a\prod_{p\mid a,\ p\nmid N}\left(1-\dfrac1p\right)$,

and Theorem 272 follows at once.

When $m = 1$, we have $c_n(1) = \mu(n)$, that is

(16.6.4)    $\mu(n) = \sum_{1\le h\le n,\ (h,n)=1}e\left(\dfrac hn\right)$.
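
The three expressions for $c_n(m)$, namely the definition (16.6.1) and the formulae of Theorems 271 and 272, can be compared numerically. The sketch below is ours; the function names are hypothetical, $e(t)$ is taken to mean $e^{2\pi it}$ as in § 5.6, and the definition is evaluated in floating-point arithmetic, so that agreement is tested only to within a small tolerance.

```python
import cmath
from math import gcd


def mobius(n):
    """mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n):
    """Direct count of the integers 1 <= k <= n prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def ramanujan_direct(n, m):
    """(16.6.1): sum of e(hm/n) over 1 <= h <= n with (h, n) = 1."""
    return sum(cmath.exp(2j * cmath.pi * h * m / n)
               for h in range(1, n + 1) if gcd(h, n) == 1)


def ramanujan_divisor_sum(n, m):
    """Theorem 271: sum of mu(n/d) d over the common divisors d of m and n."""
    a = gcd(n, m)
    return sum(mobius(n // d) * d for d in range(1, a + 1) if a % d == 0)


def ramanujan_quotient(n, m):
    """Theorem 272: mu(N) phi(n) / phi(N), where n = aN and a = (n, m)."""
    N = n // gcd(n, m)
    return mobius(N) * phi(n) // phi(N)   # the quotient phi(n)/phi(N) is integral


print(ramanujan_direct(12, 8))
print(ramanujan_divisor_sum(12, 8), ramanujan_quotient(12, 8))
```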


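16.7. The functions $d(n)$ and $\sigma_k(n)$. The function $d(n)$ is the
number of divisors of $n$, including 1 and $n$, while $\sigma_k(n)$ is the sum of the
$k$th powers of the divisors of $n$. Thus

$\sigma_k(n) = \sum_{d\mid n}d^k$,    $d(n) = \sum_{d\mid n}1$,

and $d(n) = \sigma_0(n)$. We write $\sigma(n)$ for $\sigma_1(n)$, the sum of the divisors of $n$.

If    $n = p_1^{a_1}p_2^{a_2}\ldots p_l^{a_l}$,
then the divisors of $n$ are the numbers

$p_1^{b_1}p_2^{b_2}\ldots p_l^{b_l}$,

where    $0 \le b_1 \le a_1$,  $0 \le b_2 \le a_2$,  ...,  $0 \le b_l \le a_l$.
There are    $(a_1+1)(a_2+1)\ldots(a_l+1)$
of these numbers. Hence

THEOREM 273:    $d(n) = \prod_{i=1}^{l}(a_i+1)$.

More generally, if $k > 0$,

$\sigma_k(n) = \sum_{b_1=0}^{a_1}\sum_{b_2=0}^{a_2}\ldots\sum_{b_l=0}^{a_l}p_1^{b_1k}p_2^{b_2k}\ldots p_l^{b_lk} = \prod_{i=1}^{l}\left(1+p_i^k+p_i^{2k}+\ldots+p_i^{a_ik}\right)$.

Hence

THEOREM 274:    $\sigma_k(n) = \prod_{i=1}^{l}\dfrac{p_i^{(a_i+1)k}-1}{p_i^k-1}$.

In particular,

THEOREM 275:    $\sigma(n) = \prod_{i=1}^{l}\dfrac{p_i^{a_i+1}-1}{p_i-1}$.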
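
Theorems 273 to 275 reduce the computation of $d(n)$ and $\sigma_k(n)$ to the factorization of $n$. The sketch below is ours; `factorize`, `d` and `sigma` are hypothetical names, and `sigma` is restricted to $k > 0$ because the quotient in Theorem 274 has no meaning for $k = 0$.

```python
from math import prod


def factorize(n):
    """Return {p: a} with n equal to the product of the p^a."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def d(n):
    """Theorem 273: d(n) is the product of the a_i + 1."""
    return prod(a + 1 for a in factorize(n).values())


def sigma(n, k=1):
    """Theorem 274 (and Theorem 275 when k = 1), for k > 0."""
    if k <= 0:
        raise ValueError("Theorem 274 is stated for k > 0")
    return prod((p ** ((a + 1) * k) - 1) // (p ** k - 1)
                for p, a in factorize(n).items())


print(d(28), sigma(28), sigma(28, 2))  # 6 56 1050
```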


16.8. Perfect numbers. A perfect number is a number $n$ such
that $\sigma(n) = 2n$. In other words a number is perfect if it is the sum of
its divisors other than itself. Since $1+2+3 = 6$, and

$1+2+4+7+14 = 28$,

6 and 28 are perfect numbers.

The only general class of perfect numbers known occurs in Euclid.

THEOREM 276. If $2^{n+1}-1$ is prime, then $2^n(2^{n+1}-1)$ is perfect.

Write $2^{n+1}-1 = p$, $N = 2^np$. Then, by Theorem 275,

$\sigma(N) = (2^{n+1}-1)(p+1) = 2^{n+1}(2^{n+1}-1) = 2N$,

so that $N$ is perfect.


Theorem 276 shows that to every Mersenne prime there corresponds
a perfect number. On the other hand, if $N = 2^np$ is perfect, we have

$\sigma(N) = (2^{n+1}-1)(p+1) = 2^{n+1}p$

and so    $p = 2^{n+1}-1$.
Hence there is a Mersenne prime corresponding to any perfect number
of the form $2^np$. But we can prove more than this.

THEOREM 277. Any even perfect number is a Euclid number, that is to
say of the form $2^n(2^{n+1}-1)$, where $2^{n+1}-1$ is prime.

We can write any such number in the form $N = 2^nb$, where $n > 0$
and $b$ is odd. By Theorem 275, $\sigma(n)$ is multiplicative, and therefore

$\sigma(N) = \sigma(2^n)\sigma(b) = (2^{n+1}-1)\sigma(b)$.

Since $N$ is perfect,    $\sigma(N) = 2N = 2^{n+1}b$;

and so    $\dfrac b{\sigma(b)} = \dfrac{2^{n+1}-1}{2^{n+1}}$.

The fraction on the right-hand side is in its lowest terms, and therefore

$b = (2^{n+1}-1)c$,    $\sigma(b) = 2^{n+1}c$,

where $c$ is an integer.

If $c > 1$, $b$ has at least the divisors

$b$, $c$, 1,

so that    $\sigma(b) \ge b+c+1 = 2^{n+1}c+1 > 2^{n+1}c = \sigma(b)$,
a contradiction. Hence $c = 1$,

$N = 2^n(2^{n+1}-1)$,

and    $\sigma(2^{n+1}-1) = 2^{n+1}$.
But, if $2^{n+1}-1$ is not prime, it has divisors other than itself and 1, and

$\sigma(2^{n+1}-1) > 2^{n+1}$.

Hence $2^{n+1}-1$ is prime, and the theorem is proved.

The Euclid numbers corresponding to the Mersenne primes are the
only perfect numbers known. It seems probable that there are no odd
perfect numbers, but this has not been proved. The most that is known
in this direction is that no odd perfect number can have less than six
different prime factors or be less than $1.4 \times 10^{14}$.
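
Theorems 276 and 277 may be illustrated for small numbers. The sketch below is ours, with hypothetical function names: it lists the Euclid numbers $2^n(2^{n+1}-1)$ for which $2^{n+1}-1$ is prime, and compares them with the even perfect numbers found by computing $\sigma(N)$ directly.

```python
def sigma(n):
    """Sum of all divisors of n, pairing each divisor d with n // d."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def euclid_numbers(max_n):
    """The numbers 2^n (2^(n+1) - 1) with 2^(n+1) - 1 prime, for 1 <= n <= max_n."""
    return [2 ** n * (2 ** (n + 1) - 1)
            for n in range(1, max_n + 1) if is_prime(2 ** (n + 1) - 1)]


def even_perfect_numbers_below(limit):
    """Even N < limit with sigma(N) = 2N, found without using Theorem 277."""
    return [N for N in range(2, limit, 2) if sigma(N) == 2 * N]


print(euclid_numbers(12))                 # [6, 28, 496, 8128, 33550336]
print(even_perfect_numbers_below(10000))  # [6, 28, 496, 8128]
```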


16.9. The function $r(n)$. We define $r(n)$ as the number of representations
of $n$ in the form

$n = A^2+B^2$,

where $A$ and $B$ are rational integers. We count representations as
distinct even when they differ only ‘trivially’, i.e. in respect of the sign
or order of $A$ and $B$. Thus

$0 = 0^2+0^2$,    $r(0) = 1$;
$1 = (\pm1)^2+0^2 = 0^2+(\pm1)^2$,    $r(1) = 4$;
$5 = (\pm2)^2+(\pm1)^2 = (\pm1)^2+(\pm2)^2$,    $r(5) = 8$.

We know already (§ 15.1) that $r(n) = 8$ when $n$ is a prime $4m+1$;
the representation is unique apart from its eight trivial variations. On
the other hand, $r(n) = 0$ when $n$ is of the form $4m+3$.

We define $\chi(n)$, for $n > 0$, by

$\chi(n) = 0$  $(2 \mid n)$,    $\chi(n) = (-1)^{\frac12(n-1)}$  $(2 \nmid n)$.

Thus $\chi(n)$ assumes the values 1, 0, -1, 0, 1,... for $n = 1, 2, 3,\ldots$. Since

$\tfrac12(nn'-1)-\tfrac12(n-1)-\tfrac12(n'-1) = \tfrac12(n-1)(n'-1) \equiv 0 \pmod 2$

when $n$ and $n'$ are odd, $\chi(n)$ satisfies

$\chi(nn') = \chi(n)\chi(n')$

for all $n$ and $n'$. In particular $\chi(n)$ is multiplicative in the sense of § 5.5.
It is plain that, if we write

(16.9.1)    $\delta(n) = \sum_{d\mid n}\chi(d)$,

then

(16.9.2)    $\delta(n) = d_1(n)-d_3(n)$,

where $d_1(n)$ and $d_3(n)$ are the numbers of divisors of $n$ of the forms
$4m+1$ and $4m+3$ respectively.

Suppose now that

(16.9.3)    $n = 2^\alpha N = 2^\alpha\mu\nu = 2^\alpha\prod p^r\prod q^s$,

where $p$ and $q$ are primes $4m+1$ and $4m+3$ respectively. If there are
no factors $q$, so that $\prod q^s$ is ‘empty’, then we define $\nu$ as 1. Plainly

$\delta(n) = \delta(N)$.

The divisors of $N$ are the terms in the product

(16.9.4)    $\prod(1+p+\ldots+p^r)\prod(1+q+\ldots+q^s)$.

A divisor is $4m+1$ if it contains an even number of factors $q$, and $4m+3$
in the contrary case. Hence $\delta(N)$ is obtained by writing 1 for $p$ and
$-1$ for $q$ in (16.9.4); and

(16.9.5)    $\delta(N) = \prod(r+1)\prod\left(\dfrac{1+(-1)^s}2\right)$.

If any $s$ is odd, i.e. if $\nu$ is not a square, then

$\delta(n) = \delta(N) = 0$;

while    $\delta(n) = \delta(N) = \prod(r+1) = d(\mu)$
if $\nu$ is a square.

Our object is to prove

THEOREM 278: If $n \ge 1$, then

$r(n) = 4\delta(n)$.

We have therefore to show that $r(n)$ is $4d(\mu)$ when $\nu$ is a square, and
zero otherwise.
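
Before the proof, Theorem 278 can be confirmed for small $n$ by direct enumeration. The sketch below is ours; `chi`, `delta` and `r` are hypothetical names for the functions $\chi(n)$, $\delta(n)$ and $r(n)$ defined above.

```python
from math import isqrt


def chi(n):
    """chi(n) = 0 for even n, and (-1)^((n-1)/2) for odd n."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def delta(n):
    """(16.9.1): the sum of chi(d) over the divisors d of n, that is d_1(n) - d_3(n)."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)


def r(n):
    """Number of pairs (A, B) of rational integers with A^2 + B^2 = n."""
    count = 0
    for a in range(-isqrt(n), isqrt(n) + 1):
        b_squared = n - a * a
        b = isqrt(b_squared)
        if b * b == b_squared:
            count += 1 if b == 0 else 2     # B = 0 once, otherwise B = +b and -b
    return count


print(r(0), r(1), r(5))        # 1 4 8
print(r(25), 4 * delta(25))    # 12 12
```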
16.10. Proof of the formula for $r(n)$. We write (16.9.3) in the
form    $n = \{(1+i)(1-i)\}^\alpha\prod\{(a+bi)(a-bi)\}^r\prod q^s$,
where $a$ and $b$ are positive and unequal and

$p = a^2+b^2$.

This expression of $p$ is unique (after § 15.1) except for the order of $a$
and $b$. The factors

$1\pm i$,    $a\pm bi$,    $q$

are primes of $k(i)$.

If    $n = A^2+B^2 = (A+Bi)(A-Bi)$,

then

$A+Bi = i^t(1+i)^{\alpha_1}(1-i)^{\alpha_2}\prod\{(a+bi)^{r_1}(a-bi)^{r_2}\}\prod q^{s_1}$,
$A-Bi = i^{-t}(1-i)^{\alpha_1}(1+i)^{\alpha_2}\prod\{(a-bi)^{r_1}(a+bi)^{r_2}\}\prod q^{s_2}$,

where

$t = 0, 1, 2,$ or $3$,    $\alpha_1+\alpha_2 = \alpha$,    $r_1+r_2 = r$,    $s_1+s_2 = s$.

Plainly $s_1 = s_2$, so that every $s$ is even, and $\nu$ is a square. Unless this
is so there is no representation.

We suppose then that

$\nu = \prod q^s = \prod q^{2s_1}$

is a square. There is no choice in the division of the factors $q$ between
$A+Bi$ and $A-Bi$. There are

$4(\alpha+1)\prod(r+1)$

choices in the division of the other factors. But

$\dfrac{1-i}{1+i} = -i$

is a unity, so that a change in $\alpha_1$ and $\alpha_2$ produces no variation in $A$
and $B$ beyond that produced by variation of $t$. We are thus left with

$4\prod(r+1) = 4d(\mu)$

possibly effective choices, i.e. choices which may produce variation in
$A$ and $B$.

The trivial variations in a representation $n = A^2+B^2$ correspond
(i) to multiplication of $A+Bi$ by a unity and (ii) to exchange of $A+Bi$
with its conjugate. Thus

$1(A+Bi) = A+Bi$,    $i(A+Bi) = -B+Ai$,
$i^2(A+Bi) = -A-Bi$,    $i^3(A+Bi) = B-Ai$,

and $A-Bi$, $-B-Ai$, $-A+Bi$, $B+Ai$ are the conjugates of these four
numbers. Any change in $t$ varies the representation. Any change
in the $r_1$ and $r_2$ also varies the representation, and in a manner not
accounted for by any change in $t$; for

$i^t(1+i)^{\alpha_1}(1-i)^{\alpha_2}\prod\{(a+bi)^{r_1}(a-bi)^{r_2}\}$
$= i^{t'}(1+i)^{\alpha_1'}(1-i)^{\alpha_2'}\prod\{(a+bi)^{r_1'}(a-bi)^{r_2'}\}$

is impossible, after Theorem 215, unless $r_1 = r_1'$ and $r_2 = r_2'$.† There
are therefore $4d(\mu)$ different sets of values of $A$ and $B$, or of representations
of $n$; and this proves Theorem 278.


NOTES ON CHAPTER XVI 


§ 16.1. The argument follows Pólya and Szegő, ii. 119-20, 326-7.

§§ 16.3-5. The function μ(n) occurs implicitly in the work of Euler as early as
1748, but Möbius, in 1832, was the first to investigate its properties systematically.
See Landau, Handbuch, 567-87 and 901.

§ 16.6. Ramanujan, Collected papers, 180. Our method of proof of Theorem 271
was suggested by Professor van der Pol. Theorem 272 is due to Hölder, Prace
Mat. Fiz. 43 (1936), 13-23. See also Zuckermann, American Math. Monthly, 59
(1952), 230 and Anderson and Apostol, Duke Math. Journ. 20 (1953), 211-16.

§§ 16.7-8. There is a very full account of the history of the theorems of these
sections in Dickson, History, i, chs. i-ii. For the theorems referred to at the end
of § 16.8, see Kanold, Journ. für Math. 186 (1944), 25-29 and Kühnel, Math. Zeit.
52 (1949), 202-11. We have to thank Mr. C. J. Morse for pointing out an error in
our earlier proof of Theorem 277.

§ 16.9. Theorem 278 was first proved by Jacobi by means of the theory of
elliptic functions. It is, however, equivalent to one stated by Gauss, D.A., § 182;
and there had been many incomplete proofs or statements published before. See
Dickson, History, ii, ch. vi, and Bachmann, Niedere Zahlentheorie, ii, ch. vii.


† Change of $r_1$ into $r_2$ and $r_2$ into $r_1$ (together with corresponding changes in t, $\alpha_1$, $\alpha_2$)
changes $A+Bi$ into its conjugate.


XVII 
GENERATING FUNCTIONS OF ARITHMETICAL FUNCTIONS 


17.1. The generation of arithmetical functions by means of
Dirichlet series. A Dirichlet series is a series of the form

(17.1.1) $F(s) = \sum_{n=1}^{\infty} \frac{\alpha_n}{n^s}.$

The variable s may be real or complex, but here we shall be concerned
with real values only. F(s), the sum of the series, is called the generating
function of $\alpha_n$.

The theory of Dirichlet series, when studied seriously for its own
sake, involves many delicate questions of convergence. These are mostly
irrelevant here, since we are concerned primarily with the formal side
of the theory; and most of our results could be proved (as we explain
later in § 17.6) without the use of any theorem of analysis or even the
notion of the sum of an infinite series. There are however some theorems
which must be considered as theorems of analysis; and, even when this
is not so, the reader will probably find it easier to think of the series
which occur as sums in the ordinary analytical sense.

We shall use the four theorems which follow. These are special cases
of more general theorems which, when they occur in their proper places
in the general theory, can be proved better by different methods. We
confine ourselves here to what is essential for our immediate purpose.

(1) If $\sum \alpha_n n^{-s}$ is absolutely convergent for a given s, then it is
absolutely convergent for all greater s. This is obvious because

$|\alpha_n n^{-s_2}| \le |\alpha_n n^{-s_1}|$

when $n \ge 1$ and $s_2 > s_1$.


(2) If $\sum \alpha_n n^{-s}$ is absolutely convergent for $s > s_0$, then the equation
(17.1.1) may be differentiated term by term, so that

(17.1.2) $F'(s) = -\sum \frac{\alpha_n \log n}{n^s}$

for $s > s_0$. To prove this, suppose that

$s_0 < s_0 + \delta = s_1 \le s \le s_2.$

Then $\log n \le K(\delta)\, n^{\frac12\delta}$, where $K(\delta)$ depends only on $\delta$, and

$\left|\frac{\alpha_n \log n}{n^s}\right| \le K(\delta) \left|\frac{\alpha_n}{n^{s_0+\frac12\delta}}\right|$

for all s of the interval $(s_1, s_2)$. Since

$\sum \left|\frac{\alpha_n}{n^{s_0+\frac12\delta}}\right|$

is convergent, the series on the right of (17.1.2) is uniformly convergent
in $(s_1, s_2)$, and the differentiation is justifiable.


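(3) If $F(s) = \sum \alpha_n n^{-s} = 0$
for $s > s_0$, then $\alpha_n = 0$ for all n. To prove this, suppose that $\alpha_m$ is the
first non-zero coefficient. Then

(17.1.3) $0 = F(s) = \alpha_m m^{-s}\left\{1 + \frac{\alpha_{m+1}}{\alpha_m}\left(\frac{m+1}{m}\right)^{-s} + \frac{\alpha_{m+2}}{\alpha_m}\left(\frac{m+2}{m}\right)^{-s} + \ldots\right\} = \alpha_m m^{-s}\{1+G(s)\},$

say. If $s_0 < s_1 < s$, then

$\left(\frac{m+k}{m}\right)^{-s} \le \left(\frac{m+1}{m}\right)^{-(s-s_1)} \left(\frac{m+k}{m}\right)^{-s_1}$

and

$|G(s)| \le \frac{1}{|\alpha_m|}\left(\frac{m+1}{m}\right)^{-(s-s_1)} m^{s_1} \sum_{n=m+1}^{\infty} \frac{|\alpha_n|}{n^{s_1}},$

which tends to 0 when $s \to \infty$. Hence

$|1+G(s)| > \tfrac12$

for sufficiently large s; and (17.1.3) implies $\alpha_m = 0$, a contradiction.
It follows that if $\sum \alpha_n n^{-s} = \sum \beta_n n^{-s}$
for $s > s_0$, then $\alpha_n = \beta_n$ for all n. We refer to this theorem as the
‘uniqueness theorem’.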

(4) Two absolutely convergent Dirichlet series may be multiplied in 
a manner explained in § 17.4. 


17.2. The zeta function. The simplest infinite Dirichlet series is

(17.2.1) $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$

It is convergent for s > 1, and its sum $\zeta(s)$ is called the Riemann zeta
function. In particular†

(17.2.2) $\zeta(2) = \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$

† $\zeta(2n)$ is a rational multiple of $\pi^{2n}$ for all positive integral n. Thus $\zeta(4) = \pi^4/90$,
and generally

$\zeta(2n) = \frac{2^{2n-1} B_n}{(2n)!}\,\pi^{2n},$

where $B_n$ is Bernoulli’s number.




If we differentiate (17.2.1) term by term with respect to s, we obtain

THEOREM 279: $\zeta'(s) = -\sum_{n=1}^{\infty} \frac{\log n}{n^s} \quad (s > 1).$

The zeta function is fundamental in the theory of prime numbers.
Its importance depends on a remarkable identity discovered by Euler,
which expresses the function as a product extended over prime numbers
only.

THEOREM 280. If s > 1 then

$\zeta(s) = \prod_p \frac{1}{1-p^{-s}}.$


Since $p \ge 2$, we have

(17.2.3) $\frac{1}{1-p^{-s}} = 1 + p^{-s} + p^{-2s} + \ldots$

for s > 1 (indeed for s > 0). If we take p = 2, 3,..., P, and multiply
the series together, the general term resulting is of the type

$2^{-a_2 s}\, 3^{-a_3 s} \ldots P^{-a_P s} = n^{-s},$

where $n = 2^{a_2} 3^{a_3} \ldots P^{a_P}$ $(a_2 \ge 0,\ a_3 \ge 0,\ \ldots,\ a_P \ge 0)$.

A number n will occur if and only if it has no prime factors greater
than P, and then, by Theorem 2, once only. Hence

$\prod_{p \le P} \frac{1}{1-p^{-s}} = \sum_{(P)} n^{-s},$

the summation on the right-hand side extending over numbers formed
from the primes up to P.
These numbers include all numbers up to P, so that

$0 < \sum_{n=1}^{\infty} n^{-s} - \sum_{(P)} n^{-s} < \sum_{n=P+1}^{\infty} n^{-s},$

and the last sum tends to 0 when $P \to \infty$. Hence

$\sum_{n=1}^{\infty} n^{-s} = \lim_{P\to\infty} \sum_{(P)} n^{-s} = \lim_{P\to\infty} \prod_{p \le P} \frac{1}{1-p^{-s}},$

the result of Theorem 280.
Theorem 280 may be regarded as an analytical expression of the
fundamental theorem of arithmetic.
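The proof of Theorem 280 can be watched numerically. The following Python sketch is an illustration only, and the function names are ours, not the book's: it forms the finite product over the primes p ≤ P and compares it with a partial sum of the series and with the value of ζ(2) given in (17.2.2).

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # strike out the multiples of i, starting at i*i
            sieve[i * i :: i] = [False] * len(range(i * i, limit + 1, i))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def euler_product(s, P):
    """Finite product of 1/(1 - p^(-s)) over the primes p <= P."""
    product = 1.0
    for p in primes_up_to(P):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

def zeta_partial(s, N):
    """Partial sum 1^(-s) + 2^(-s) + ... + N^(-s)."""
    return sum(n ** (-s) for n in range(1, N + 1))

if __name__ == "__main__":
    for P in (10, 100, 1000, 10000):
        print(P, zeta_partial(2, P), euler_product(2, P), math.pi ** 2 / 6)
```

For each P the product lies between the partial sum up to P and ζ(2), as the inequality in the proof requires, and it approaches ζ(2) as P grows.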


17.3. The behaviour of ζ(s) when s → 1. We shall require later
to know how $\zeta(s)$ and $\zeta'(s)$ behave when s tends to 1 through values
greater than 1.

We can write $\zeta(s)$ in the form

(17.3.1) $\zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \int_1^{\infty} x^{-s}\,dx + \sum_{n=1}^{\infty} \int_n^{n+1} (n^{-s} - x^{-s})\,dx.$

Here

$\int_1^{\infty} x^{-s}\,dx = \frac{1}{s-1},$

since s > 1. Also

$0 < n^{-s} - x^{-s} = \int_n^x s\,t^{-s-1}\,dt < \frac{s}{n^2}$

if $n < x < n+1$, and so

$0 < \int_n^{n+1} (n^{-s} - x^{-s})\,dx < \frac{s}{n^2};$

and the last term in (17.3.1) is positive and numerically less than
$s \sum n^{-2}$. Hence

THEOREM 281: $\zeta(s) = \frac{1}{s-1} + O(1).$

Also

$\log \zeta(s) = \log\frac{1}{s-1} + \log\{1 + O(s-1)\},$

and so

THEOREM 282: $\log \zeta(s) = \log\frac{1}{s-1} + O(s-1).$


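We may also argue with

$-\zeta'(s) = \sum_{n=1}^{\infty} n^{-s}\log n = \int_1^{\infty} x^{-s}\log x\,dx + \sum_{n=1}^{\infty} \int_n^{n+1} (n^{-s}\log n - x^{-s}\log x)\,dx$

much as with $\zeta(s)$, and deduce

THEOREM 283: $\zeta'(s) = -\frac{1}{(s-1)^2} + O(1).$

In particular

$\zeta(s) \sim \frac{1}{s-1}.$

This may also be proved by observing that, if s > 1,

$(1 - 2^{1-s})\zeta(s) = 1^{-s} + 2^{-s} + 3^{-s} + \ldots - 2(2^{-s} + 4^{-s} + 6^{-s} + \ldots) = 1^{-s} - 2^{-s} + 3^{-s} - \ldots,$

and that the last series converges to log 2 for s = 1.† Hence

$(s-1)\zeta(s) = (1 - 2^{1-s})\zeta(s)\,\frac{s-1}{1-2^{1-s}} \to \log 2 \cdot \frac{1}{\log 2} = 1.$

† We assume here that

$\lim_{s \to 1} \sum \frac{a_n}{n^s} = \sum \frac{a_n}{n}$

whenever the series on the right is convergent, a theorem not included in those of § 17.1.
We do not prove this theorem because we require it only for an alternative proof.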
17.4. Multiplication of Dirichlet series. Suppose that we are
given a finite set of Dirichlet series

(17.4.1) $\sum \alpha_u u^{-s}, \quad \sum \beta_v v^{-s}, \quad \sum \gamma_w w^{-s}, \quad \ldots,$

and that we multiply them together in the sense of forming all possible
products with one factor selected from each series. The general term
resulting is

$\alpha_u u^{-s} \cdot \beta_v v^{-s} \cdot \gamma_w w^{-s} \ldots = \alpha_u \beta_v \gamma_w \ldots n^{-s},$

where n = uvw.... If now we add together all terms for which n has
a given value, we obtain a single term $\chi_n n^{-s}$, where

(17.4.2) $\chi_n = \sum_{uvw\ldots = n} \alpha_u \beta_v \gamma_w \ldots.$

The series $\sum \chi_n n^{-s}$, with $\chi_n$ defined by (17.4.2), is called the formal
product of the series (17.4.1).

The simplest case is that in which there are only two series (17.4.1),
$\sum \alpha_u u^{-s}$ and $\sum \beta_v v^{-s}$. If (changing our notation a little) we denote
their formal product by $\sum \gamma_n n^{-s}$, then

(17.4.3) $\gamma_n = \sum_{uv=n} \alpha_u \beta_v = \sum_{d \mid n} \alpha_d \beta_{n/d} = \sum_{d \mid n} \alpha_{n/d} \beta_d,$

a sum of a type which occurred frequently in Ch. XVI. And if the
two given series are absolutely convergent, and their sums are F(s) and
G(s), then

$F(s)G(s) = \sum_u \alpha_u u^{-s} \sum_v \beta_v v^{-s} = \sum_{u,v} \alpha_u \beta_v (uv)^{-s} = \sum_n n^{-s} \sum_{uv=n} \alpha_u \beta_v = \sum \gamma_n n^{-s},$

since we may multiply two absolutely convergent series and arrange
the terms of the product in any order that we please.

THEOREM 284. If the series

$F(s) = \sum \alpha_u u^{-s}, \quad G(s) = \sum \beta_v v^{-s}$

are absolutely convergent, then

$F(s)G(s) = \sum \gamma_n n^{-s},$

where $\gamma_n$ is defined by (17.4.3).

Conversely, if $H(s) = \sum \delta_n n^{-s} = F(s)G(s)$
then it follows from the uniqueness theorem of § 17.1 that $\delta_n = \gamma_n$.
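The formal product of two series is easily computed from (17.4.3). The sketch below is ours and is meant only as an illustration: a series is represented by the list of its coefficients up to a bound N (the index 0 is unused), and the product is truncated at the same bound. Taking both factors to be the series with all coefficients 1 gives the divisor function d(n).

```python
def dirichlet_convolve(alpha, beta, N):
    """Coefficients gamma[1..N] of the formal product, as in (17.4.3):
    gamma[n] is the sum of alpha[u] * beta[v] over all pairs with u*v = n."""
    gamma = [0] * (N + 1)
    for u in range(1, N + 1):
        if alpha[u] == 0:
            continue
        for v in range(1, N // u + 1):
            gamma[u * v] += alpha[u] * beta[v]
    return gamma

if __name__ == "__main__":
    N = 12
    one = [0] + [1] * N            # coefficients of the series sum n^(-s)
    d = dirichlet_convolve(one, one, N)
    print(d[1:])                   # number of divisors of 1, 2, ..., 12
```

Truncation loses nothing here, since the coefficient of $n^{-s}$ in the product depends only on coefficients with index at most n.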


Our definition of the formal product may be extended, with proper
precautions, to an infinite set of series. It is convenient to suppose that

$\alpha_1 = \beta_1 = \gamma_1 = \ldots = 1.$

Then the term $\alpha_u \beta_v \gamma_w \ldots$
in (17.4.2) contains only a finite number of factors which are not 1,
and we may define $\chi_n$ by (17.4.2) whenever the series is absolutely
convergent.†

The most important case is that in which f(1) = 1, f(n) is
multiplicative, and the series (17.4.1) are

(17.4.4) $1 + f(p)p^{-s} + f(p^2)p^{-2s} + \ldots + f(p^a)p^{-as} + \ldots$

for p = 2, 3, 5,...; so that, for example, $\alpha_u$ is $f(2^a)$ when $u = 2^a$ and 0
otherwise. Then, after Theorem 2, every n occurs just once as a product
uvw... with a non-zero coefficient, and

$\chi_n = f(p_1^{a_1}) f(p_2^{a_2}) \ldots = f(n)$

when $n = p_1^{a_1} p_2^{a_2} \ldots$. It will be observed that the series (17.4.2) reduces
to a single term, so that no question of convergence arises.

Hence

THEOREM 285. If f(1) = 1 and f(n) is multiplicative, then

$\sum f(n) n^{-s}$

is the formal product of the series (17.4.4).

In particular, $\sum n^{-s}$ is the formal product of the series

$1 + p^{-s} + p^{-2s} + \ldots.$

Theorem 280 says in some ways more than this, namely that $\zeta(s)$,
the sum of the series $\sum n^{-s}$ when s > 1, is equal to the product of the
sums of the series $1 + p^{-s} + p^{-2s} + \ldots$. The proof can be generalized to cover
the more general case considered here.

THEOREM 286. If f(n) satisfies the conditions of Theorem 285, and

(17.4.5) $\sum |f(n)|\, n^{-s}$

is convergent, then

$F(s) = \sum f(n) n^{-s} = \prod_p \{1 + f(p)p^{-s} + f(p^2)p^{-2s} + \ldots\}.$

We write

$F_p(s) = 1 + f(p)p^{-s} + f(p^2)p^{-2s} + \ldots;$

the absolute convergence of the series is a corollary of the convergence
of (17.4.5). Hence, arguing as in § 17.2, and using the multiplicative
property of f(n), we obtain

$\prod_{p \le P} F_p(s) = \sum_{(P)} f(n) n^{-s}.$

Since

$\left|\sum_{n=1}^{\infty} f(n) n^{-s} - \sum_{(P)} f(n) n^{-s}\right| \le \sum_{n=P+1}^{\infty} |f(n)|\, n^{-s} \to 0,$

the result follows as in § 17.2.


† We must assume absolute convergence because we have not specified the order in
which the terms are to be taken.
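17.5. The generating functions of some special arithmetical
functions. The generating functions of most of the arithmetical
functions which we have considered are simple combinations of zeta functions.
In this section we work out some of the most important examples.

THEOREM 287: $\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} \quad (s > 1).$

This follows at once from Theorems 280, 262, and 286, since

$\frac{1}{\zeta(s)} = \prod_p (1 - p^{-s}) = \prod_p \{1 + \mu(p)p^{-s} + \mu(p^2)p^{-2s} + \ldots\} = \sum_{n=1}^{\infty} \mu(n) n^{-s}.$

THEOREM 288: $\frac{\zeta(s-1)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\phi(n)}{n^s} \quad (s > 2).$

By Theorem 287, Theorem 284, and (16.3.1)

$\frac{\zeta(s-1)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{n}{n^s} \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \sum_{n=1}^{\infty} \frac{1}{n^s} \sum_{d \mid n} d\,\mu\!\left(\frac{n}{d}\right) = \sum_{n=1}^{\infty} \frac{\phi(n)}{n^s}.$

THEOREM 289: $\zeta^2(s) = \sum_{n=1}^{\infty} \frac{d(n)}{n^s} \quad (s > 1).$

THEOREM 290: $\zeta(s)\zeta(s-1) = \sum_{n=1}^{\infty} \frac{\sigma(n)}{n^s} \quad (s > 2).$

These are special cases of the theorem

THEOREM 291: $\zeta(s)\zeta(s-k) = \sum_{n=1}^{\infty} \frac{\sigma_k(n)}{n^s} \quad (s > 1,\ s > k+1).$

In fact

$\zeta(s)\zeta(s-k) = \sum_{n=1}^{\infty} \frac{1}{n^s} \sum_{n=1}^{\infty} \frac{n^k}{n^s} = \sum_{n=1}^{\infty} \frac{1}{n^s} \sum_{d \mid n} d^k = \sum_{n=1}^{\infty} \frac{\sigma_k(n)}{n^s},$

by Theorem 284.

THEOREM 292: $\sum_{n=1}^{\infty} \frac{c_n(m)}{n^s} = \frac{\sigma_{s-1}(m)}{m^{s-1}\zeta(s)} \quad (s > 1).$

By Theorem 271,

$c_n(m) = \sum_{d \mid m,\ d \mid n} \mu\!\left(\frac{n}{d}\right) d = \sum_{dd' = n,\ d \mid m} \mu(d')\,d;$

and so

$\sum_{n=1}^{\infty} \frac{c_n(m)}{n^s} = \sum_{n=1}^{\infty} \sum_{dd' = n,\ d \mid m} \frac{\mu(d')\,d}{d'^s d^s} = \sum_{d'=1}^{\infty} \frac{\mu(d')}{d'^s} \sum_{d \mid m} \frac{1}{d^{s-1}} = \frac{1}{\zeta(s)} \sum_{d \mid m} d^{1-s}.$

Finally

$\sum_{d \mid m} d^{1-s} = m^{1-s} \sum_{d \mid m} d^{s-1} = m^{1-s}\,\sigma_{s-1}(m).$

In particular

THEOREM 293: $\sum_{n=1}^{\infty} \frac{c_n(m)}{n^2} = \frac{6}{\pi^2}\,\frac{\sigma(m)}{m}.$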
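17.6. The analytical interpretation of the Möbius formula.
Suppose that

$g(n) = \sum_{d \mid n} f(d),$

and that F(s) and G(s) are the generating functions of f(n) and g(n).
Then, if the series are absolutely convergent, we have

$F(s)\zeta(s) = \sum_{n=1}^{\infty} \frac{f(n)}{n^s} \sum_{n=1}^{\infty} \frac{1}{n^s} = \sum_{n=1}^{\infty} \frac{1}{n^s} \sum_{d \mid n} f(d) = \sum_{n=1}^{\infty} \frac{g(n)}{n^s} = G(s);$

and therefore

$F(s) = \frac{G(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{g(n)}{n^s} \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \sum_{n=1}^{\infty} \frac{h(n)}{n^s},$

where

$h(n) = \sum_{d \mid n} g(d)\,\mu\!\left(\frac{n}{d}\right).$

It then follows from the uniqueness theorem of § 17.1 (3) that

$h(n) = f(n),$

which is the inversion formula of Möbius (Theorem 266). This formula
then appears as an arithmetical expression of the equivalence of the
equations

$G(s) = \zeta(s)F(s), \quad F(s) = \frac{G(s)}{\zeta(s)}.$

We cannot regard this argument, as it stands, as a proof of the
Möbius formula, since it depends upon the convergence of the series
for F(s). This hypothesis involves a limitation on the order of
magnitude of f(n), and it is obvious that such limitations are irrelevant. The
‘real’ proof of the Möbius formula is that given in § 16.4.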


We may, however, take this opportunity of expanding some remarks which
we made in § 17.1. We could construct a formal theory of Dirichlet series in
which ‘analysis’ played no part. This theory would include all identities of the
‘Möbius’ type, but the notions of the sum of an infinite series, or the value of an
infinite product, would never occur. We shall not attempt to construct such a
theory in detail, but it is interesting to consider how it would begin.

We denote the formal series $\sum a_n n^{-s}$ by A, and write

$A = \sum a_n n^{-s}.$

In particular we write

$I = 1 \cdot 1^{-s} + 0 \cdot 2^{-s} + 0 \cdot 3^{-s} + \ldots,$
$Z = 1 \cdot 1^{-s} + 1 \cdot 2^{-s} + 1 \cdot 3^{-s} + \ldots,$
$M = \mu(1)1^{-s} + \mu(2)2^{-s} + \mu(3)3^{-s} + \ldots.$

By $A = B$
we mean that $a_n = b_n$ for all values of n.
The equation $A \times B = C$
means that C is the formal product of A and B, in the sense of § 17.4. The
definition may be extended, as in § 17.4, to the product of any finite number of
series, or, with proper precautions, of an infinity. It is plain from the definition
that

$A \times B = B \times A, \quad A \times B \times C = (A \times B) \times C = A \times (B \times C),$

and so on, and that $A \times I = A$.
The equation $A \times Z = B$
means that $b_n = \sum_{d \mid n} a_d$.

Let us suppose that there is a series L such that

$Z \times L = I.$

Then $A = A \times I = A \times (Z \times L) = (A \times Z) \times L = B \times L,$
i.e. $a_n = \sum_{d \mid n} b_d\, l_{n/d}.$

The Möbius formula asserts that $l_n = \mu(n)$, or that L = M, or that

(17.6.1) $Z \times M = I;$

and this means that $\sum_{d \mid n} \mu(d)$
is 1 when n = 1 and 0 when n > 1 (Theorem 263).
We may prove this as in § 16.3, or we may continue as follows. We write

$P_p = 1 - p^{-s}, \quad Q_p = 1 + p^{-s} + p^{-2s} + \ldots,$

where p is a prime (so that $P_p$, for example, is the series A in which $a_1 = 1$,
$a_p = -1$, and the remaining coefficients are 0); and calculate the coefficient of
$n^{-s}$ in the formal product of $P_p$ and $Q_p$. This coefficient is 1 if n = 1, 1-1 = 0
if n is a positive power of p, and 0 in all other cases; so that

$P_p \times Q_p = I$

for every p.
The series $P_p$, $Q_p$, and I are of the special type considered in § 17.4; and

$Z = \prod Q_p, \quad M = \prod P_p,$
$Z \times M = \prod Q_p \times \prod P_p,$
while $\prod (Q_p \times P_p) = \prod I = I.$

But the coefficient of $n^{-s}$ in

$(Q_2 \times Q_3 \times Q_5 \times \ldots) \times (P_2 \times P_3 \times P_5 \times \ldots)$

(a product of two series of the general type) is the same as in

$Q_2 \times P_2 \times Q_3 \times P_3 \times Q_5 \times P_5 \times \ldots$

or in

$(Q_2 \times P_2) \times (Q_3 \times P_3) \times (Q_5 \times P_5) \times \ldots$

(which are each products of an infinity of series of the special type); in each
case the $\chi_n$ of § 17.4 contains only a finite number of terms. Hence

$Z \times M = \prod Q_p \times \prod P_p = \prod (Q_p \times P_p) = \prod I = I.$

It is plain that this proof of (17.6.1) is, at bottom, merely a translation into 
a different language of that of § 16.3; and that, in a simple case like this, we 
gain nothing by the translation. More complicated formulae become much easier 
to grasp and prove when stated in the language of infinite series and products, 
and it is important to realize that we can use it without analytical assumptions. 
In what follows, however, we continue to use the language of ordinary analysis. 
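The formal calculus of this section lends itself to direct computation, since every coefficient of a formal product is a finite sum. The following Python sketch is ours and purely illustrative: it truncates each series at N, computes μ(n) from the factorization of n, and checks that $P_p \times Q_p = I$ and $Z \times M = I$ as far as the truncation reaches.

```python
def formal_product(a, b, N):
    """Formal product of two series given by coefficient lists (index 0 unused)."""
    c = [0] * (N + 1)
    for u in range(1, N + 1):
        for v in range(1, N // u + 1):
            c[u * v] += a[u] * b[v]
    return c

def mobius(n):
    """mu(n) by trial division: 0 if a square divides n, else (-1)^(number of primes)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:                     # one prime factor remains
        result = -result
    return result

N = 100
I = [0, 1] + [0] * (N - 1)                       # the unit series
Z = [0] + [1] * N                                # all coefficients 1
M = [0] + [mobius(n) for n in range(1, N + 1)]   # coefficients mu(n)

def P_series(p):
    """P_p = 1 - p^(-s)."""
    a = [0] * (N + 1)
    a[1], a[p] = 1, -1
    return a

def Q_series(p):
    """Q_p = 1 + p^(-s) + p^(-2s) + ..., truncated at N."""
    a = [0] * (N + 1)
    q = 1
    while q <= N:
        a[q] = 1
        q *= p
    return a

if __name__ == "__main__":
    print(all(formal_product(P_series(p), Q_series(p), N) == I for p in (2, 3, 5, 7)))
    print(formal_product(Z, M, N) == I)
```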

17.7. The function Λ(n). The function Λ(n), which is particularly
important in the analytical theory of primes, is defined by

$\Lambda(n) = \log p \quad (n = p^m),$
$\Lambda(n) = 0 \quad (n \ne p^m),$

i.e. as being log p when n is a prime p or one of its powers, and 0
otherwise.
From Theorem 280, we have

$\log \zeta(s) = \sum_p \log\frac{1}{1-p^{-s}}.$

Differentiating with respect to s, and observing that

$\frac{d}{ds} \log\frac{1}{1-p^{-s}} = -\frac{\log p}{p^s - 1},$

we obtain

(17.7.1) $-\frac{\zeta'(s)}{\zeta(s)} = \sum_p \frac{\log p}{p^s - 1}.$

The differentiation is legitimate because the derived series is uniformly
convergent for $s \ge 1+\delta > 1$.†
We may write (17.7.1) in the form

$-\frac{\zeta'(s)}{\zeta(s)} = \sum_p \log p \sum_{m=1}^{\infty} p^{-ms}$

and the double series $\sum\sum p^{-ms}\log p$ is absolutely convergent when
s > 1. Hence it may be written as

$\sum_{p,\,m} p^{-ms} \log p = \sum \Lambda(n) n^{-s},$

by the definition of Λ(n).

THEOREM 294: $-\frac{\zeta'(s)}{\zeta(s)} = \sum \Lambda(n) n^{-s} \quad (s > 1).$
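† The nth prime $p_n$ is greater than n, and the series may be compared with $\sum n^{-s}\log n$.

Since

$\zeta'(s) = -\sum_{n=1}^{\infty} \frac{\log n}{n^s},$

by Theorem 279, it follows that

$\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = -\frac{1}{\zeta(s)}\,\zeta'(s) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} \sum_{n=1}^{\infty} \frac{\log n}{n^s},$

$\sum_{n=1}^{\infty} \frac{\log n}{n^s} = -\zeta'(s) = \zeta(s) \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = \sum_{n=1}^{\infty} \frac{1}{n^s} \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s}.$

From these equations, and the uniqueness theorem of § 17.1, we deduce†

THEOREM 295: $\Lambda(n) = \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) \log d.$

THEOREM 296: $\log n = \sum_{d \mid n} \Lambda(d).$

We may also prove these theorems directly. If $n = \prod p^a$, then

$\sum_{d \mid n} \Lambda(d) = \sum_{p^{\alpha} \mid n} \log p.$

The summation extends over all values of p, and all positive values of
α for which $p^{\alpha} \mid n$, so that log p occurs a times. Hence

$\sum_{p^{\alpha} \mid n} \log p = \sum a \log p = \log \prod p^a = \log n.$

This proves Theorem 296, and Theorem 295 follows by Theorem 266.

Again

$\frac{d}{ds}\left\{\frac{1}{\zeta(s)}\right\} = -\frac{\zeta'(s)}{\zeta^2(s)} = \frac{1}{\zeta(s)}\left\{-\frac{\zeta'(s)}{\zeta(s)}\right\},$

so that

$-\sum_{n=1}^{\infty} \frac{\mu(n)\log n}{n^s} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s}.$

Hence, as before, we deduce

THEOREM 297: $-\mu(n)\log n = \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) \Lambda(d).$

Similarly

$-\frac{\zeta'(s)}{\zeta(s)} = \zeta(s)\,\frac{d}{ds}\left\{\frac{1}{\zeta(s)}\right\},$

and from this (or from Theorems 297 and 267) we deduce

THEOREM 298: $\Lambda(n) = -\sum_{d \mid n} \mu(d)\log d.$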
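Theorems 296 and 298 are finite identities and may be tested directly. In the Python sketch below (an illustration, with helper names of our own choosing) Λ(n) and μ(n) are computed from the factorization of n, and the two sums over divisors are compared with log n and Λ(n) in floating-point arithmetic.

```python
import math

def von_mangoldt(n):
    """Lambda(n): log p if n is a positive power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n and n % p != 0:
        p += 1
    if p * p > n:
        p = n                     # no factor up to sqrt(n): n itself is prime
    while n % p == 0:
        n //= p                   # strip every factor p
    return math.log(p) if n == 1 else 0.0

def mobius(n):
    """mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def theorem_296_sum(n):
    """Right-hand side of Theorem 296: sum of Lambda(d) over d | n."""
    return sum(von_mangoldt(d) for d in divisors(n))

def theorem_298_sum(n):
    """Right-hand side of Theorem 298: -sum of mu(d) log d over d | n."""
    return -sum(mobius(d) * math.log(d) for d in divisors(n))

if __name__ == "__main__":
    for n in (1, 8, 12, 30, 49, 97):
        print(n, math.log(n), theorem_296_sum(n), von_mangoldt(n), theorem_298_sum(n))
```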
† Compare § 17.6.

17.8. Further examples of generating functions. We add a few
examples of a more miscellaneous character. We define $d_k(n)$ as the
number of ways of expressing n as the product of k positive factors
(of which any number may be unity), expressions in which only the
order of the factors is different being regarded as distinct. In particular,
$d_2(n) = d(n)$. Then

THEOREM 299: $\zeta^k(s) = \sum \frac{d_k(n)}{n^s} \quad (s > 1).$

Theorem 289 is a particular case of this theorem.

Again

$\frac{\zeta(2s)}{\zeta(s)} = \prod_p \frac{1-p^{-s}}{1-p^{-2s}} = \prod_p \frac{1}{1+p^{-s}} = \prod_p (1 - p^{-s} + p^{-2s} - \ldots) = \sum \frac{\lambda(n)}{n^s},$

where $\lambda(n) = (-1)^{\rho}$, ρ being the total number of prime factors of n,
when multiple factors are counted multiply. Thus

THEOREM 300: $\frac{\zeta(2s)}{\zeta(s)} = \sum \frac{\lambda(n)}{n^s} \quad (s > 1).$

Similarly we can prove

THEOREM 301: $\frac{\zeta^2(s)}{\zeta(2s)} = \sum \frac{2^{\omega(n)}}{n^s} \quad (s > 1),$

where ω(n) is the number of different prime factors of n.


A number n is said to be quadratfrei† if it has no squared factor. If
we write q(n) = 1 when n is quadratfrei, and q(n) = 0 when n has a
squared factor, so that q(n) = |μ(n)|, then

$\frac{\zeta(s)}{\zeta(2s)} = \prod_p \frac{1-p^{-2s}}{1-p^{-s}} = \prod_p (1 + p^{-s}) = \sum \frac{q(n)}{n^s},$

by Theorems 280 and 286. Thus

THEOREM 302: $\frac{\zeta(s)}{\zeta(2s)} = \sum \frac{q(n)}{n^s} = \sum \frac{|\mu(n)|}{n^s} \quad (s > 1).$

More generally, if $q_k(n) = 0$ or 1 according as n has or has not a kth
power as a factor, then

THEOREM 303: $\frac{\zeta(s)}{\zeta(ks)} = \sum \frac{q_k(n)}{n^s} \quad (s > 1).$

† We have already used this word in § 2.6 (p. 16); there is no convenient English
word.
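Another example, due to Ramanujan, is

THEOREM 304: $\frac{\zeta^4(s)}{\zeta(2s)} = \sum \frac{\{d(n)\}^2}{n^s} \quad (s > 1).$

This may be proved as follows. We have

$\frac{\zeta^4(s)}{\zeta(2s)} = \prod_p \frac{1-p^{-2s}}{(1-p^{-s})^4} = \prod_p \frac{1+p^{-s}}{(1-p^{-s})^3}.$

Now

$\frac{1+x}{(1-x)^3} = (1+x)(1 + 3x + 6x^2 + \ldots) = 1 + 4x + 9x^2 + \ldots = \sum_{l=0}^{\infty} (l+1)^2 x^l.$

Hence

$\frac{\zeta^4(s)}{\zeta(2s)} = \prod_p \left\{\sum_{l=0}^{\infty} (l+1)^2 p^{-ls}\right\}.$

The coefficient of $n^{-s}$, when $n = p_1^{l_1} p_2^{l_2} \ldots$, is

$(l_1+1)^2 (l_2+1)^2 \ldots = \{d(n)\}^2,$

by Theorem 273.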
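Ramanujan's identity may be confirmed on a finite range by forming the coefficients of $\zeta^4(s)/\zeta(2s)$ as a formal product. The Python sketch below is an illustration with names of our own: it uses the fact that $1/\zeta(2s) = \sum \mu(m) m^{-2s}$ (Theorem 287 with 2s for s), so that the coefficient of $n^{-s}$ in $1/\zeta(2s)$ is μ(m) when $n = m^2$ and 0 otherwise.

```python
def convolve(a, b, N):
    """Formal product of two Dirichlet series, truncated at N (index 0 unused)."""
    c = [0] * (N + 1)
    for u in range(1, N + 1):
        for v in range(1, N // u + 1):
            c[u * v] += a[u] * b[v]
    return c

def mobius_table(N):
    """mu(1..N), from mu(1) = 1 and sum_{d | n} mu(d) = 0 for n > 1."""
    mu = [0] * (N + 1)
    mu[1] = 1
    for n in range(1, N + 1):
        for m in range(2 * n, N + 1, n):
            mu[m] -= mu[n]
    return mu

N = 200
one = [0] + [1] * N                      # zeta(s)
d = convolve(one, one, N)                # zeta^2(s): coefficients d(n)
zeta4 = convolve(d, d, N)                # zeta^4(s): coefficients d_4(n)

mu = mobius_table(N)
inv_zeta_2s = [0] * (N + 1)              # 1 / zeta(2s)
m = 1
while m * m <= N:
    inv_zeta_2s[m * m] = mu[m]
    m += 1

lhs = convolve(zeta4, inv_zeta_2s, N)    # coefficients of zeta^4(s) / zeta(2s)
rhs = [x * x for x in d]                 # {d(n)}^2

if __name__ == "__main__":
    print(lhs[1:13])
    print(lhs == rhs)
```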
More generally we can prove, by similar reasoning,

THEOREM 305. If s, s-a, s-b, and s-a-b are all greater than 1, then

$\frac{\zeta(s)\zeta(s-a)\zeta(s-b)\zeta(s-a-b)}{\zeta(2s-a-b)} = \sum_{n=1}^{\infty} \frac{\sigma_a(n)\sigma_b(n)}{n^s}.$
17.9. The generating function of r(n). We saw in § 16.10 that

$r(n) = 4\sum_{d \mid n} \chi(d),$

where χ(n) is 0 when n is even and $(-1)^{\frac12(n-1)}$ when n is odd. Hence

$\sum \frac{r(n)}{n^s} = 4\sum \frac{1}{n^s} \sum \frac{\chi(n)}{n^s} = 4\zeta(s)L(s),$

where $L(s) = 1^{-s} - 3^{-s} + 5^{-s} - \ldots,$
if s > 1.

THEOREM 306: $\sum \frac{r(n)}{n^s} = 4\zeta(s)L(s) \quad (s > 1).$

The function $\eta(s) = 1^{-s} - 2^{-s} + 3^{-s} - \ldots$
is expressible in terms of $\zeta(s)$ by the formula

$\eta(s) = (1 - 2^{1-s})\zeta(s);$

but L(s), which can also be expressed in the form

$L(s) = \prod_p \frac{1}{1 - \chi(p)p^{-s}},$

is an independent function. It is the basis of the analytical theory of
the distribution of primes in the progressions 4m+1 and 4m+3.
17.10. Generating functions of other types. The generating
functions discussed in this chapter have been defined by Dirichlet
series; but any function

$F(s) = \sum \alpha_n u_n(s)$

may be regarded as a generating function of $\alpha_n$. The most usual form
of $u_n(s)$ is

$u_n(s) = e^{-\lambda_n s},$

where $\lambda_n$ is a sequence of positive numbers which increases steadily to
infinity. The most important cases are the cases $\lambda_n = \log n$ and $\lambda_n = n$.
When $\lambda_n = \log n$, $u_n(s) = n^{-s}$, and the series is a Dirichlet series. When
$\lambda_n = n$, it is a power series in

$x = e^{-s}.$

Since $m^{-s} \cdot n^{-s} = (mn)^{-s}$,
and $x^m \cdot x^n = x^{m+n}$,
the first type of series is more important in the ‘multiplicative’ side of
the theory of numbers (and in particular in the theory of primes). Such
functions as

$\sum \mu(n)x^n, \quad \sum d(n)x^n, \quad \sum \Lambda(n)x^n$

are extremely difficult to handle. But generating functions defined by
power series are dominant in the ‘additive’ theory.†

Another interesting type of series is obtained by taking

$u_n(s) = \frac{e^{-ns}}{1-e^{-ns}} = \frac{x^n}{1-x^n}.$

We write

$F(x) = \sum_{n=1}^{\infty} a_n \frac{x^n}{1-x^n},$

and disregard questions of convergence, which are not interesting here.‡
A series of this type is called a ‘Lambert series’. Then

$F(x) = \sum_{n=1}^{\infty} a_n \sum_{m=1}^{\infty} x^{mn} = \sum_{N=1}^{\infty} b_N x^N,$

where $b_N = \sum_{n \mid N} a_n$.

This relation between the a and b is that considered in §§ 16.4 and 17.6,
and it is equivalent to

$\zeta(s)f(s) = g(s),$

where f(s) and g(s) are the Dirichlet series associated with $a_n$ and $b_n$.

† See Chs. XIX-XXI.
‡ All the series of this kind which we consider are absolutely convergent when
$0 \le x < 1$.
THEOREM 307. If

$f(s) = \sum a_n n^{-s}, \quad g(s) = \sum b_n n^{-s},$

then

$F(x) = \sum a_n \frac{x^n}{1-x^n} = \sum b_n x^n$

if and only if $\zeta(s)f(s) = g(s)$.

If $f(s) = \sum \mu(n)n^{-s}$, g(s) = 1, by Theorem 287. If $f(s) = \sum \phi(n)n^{-s}$,

$g(s) = \zeta(s-1) = \sum \frac{n}{n^s},$

by Theorem 288. Hence we derive

THEOREM 308: $\sum_{n=1}^{\infty} \frac{\mu(n)x^n}{1-x^n} = x.$

THEOREM 309: $\sum_{n=1}^{\infty} \frac{\phi(n)x^n}{1-x^n} = \frac{x}{(1-x)^2}.$

Similarly, from Theorems 289 and 306, we deduce

THEOREM 310: $\sum_{n=1}^{\infty} d(n)x^n = \frac{x}{1-x} + \frac{x^2}{1-x^2} + \frac{x^3}{1-x^3} + \ldots.$

THEOREM 311: $\sum_{n=1}^{\infty} r(n)x^n = 4\left(\frac{x}{1-x} - \frac{x^3}{1-x^3} + \frac{x^5}{1-x^5} - \ldots\right).$

Theorem 311 is equivalent to a famous identity in the theory of elliptic
functions, viz.

THEOREM 312: $(1 + 2x + 2x^4 + 2x^9 + \ldots)^2 = 1 + 4\left(\frac{x}{1-x} - \frac{x^3}{1-x^3} + \frac{x^5}{1-x^5} - \ldots\right).$

In fact, if we square the series

$1 + 2x + 2x^4 + 2x^9 + \ldots = \sum_{m=-\infty}^{\infty} x^{m^2},$

the coefficient of $x^n$ is r(n), since every pair $(m_1, m_2)$ for which
$m_1^2 + m_2^2 = n$ contributes a unit to it.†

† Thus 5 arises from 8 pairs, viz. (2, 1), (1, 2), and those derived by changes of sign.
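The coefficient form of Theorems 306, 311 and 312, namely that r(n) is four times the sum of χ(d) over the divisors d of n, can be compared with a direct count of the pairs $(m_1, m_2)$. The Python sketch below is an illustration only; the function names are ours.

```python
import math

def r_count(n):
    """Number of ordered pairs (m1, m2) of integers with m1^2 + m2^2 = n."""
    bound = math.isqrt(n)
    return sum(1
               for m1 in range(-bound, bound + 1)
               for m2 in range(-bound, bound + 1)
               if m1 * m1 + m2 * m2 == n)

def chi(d):
    """chi(d) = 0 for even d, +1 for d = 1 (mod 4), -1 for d = 3 (mod 4)."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1

def r_formula(n):
    """4 times the sum of chi(d) over the divisors d of n."""
    return 4 * sum(chi(d) for d in range(1, n + 1) if n % d == 0)

if __name__ == "__main__":
    for n in range(1, 26):
        print(n, r_count(n), r_formula(n))
```

For n = 5 both functions give 8, in agreement with the eight pairs listed in the footnote.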
NOTES ON CHAPTER XVII


§ 17.1. There is a short account of the analytical theory of Dirichlet series in
Titchmarsh, Theory of functions, ch. ix; and fuller accounts, including the theory
of series of the more general type

$\sum a_n e^{-\lambda_n s}$

(referred to in § 17.10) in Hardy and Riesz, The general theory of Dirichlet’s series
(Cambridge Math. Tracts, no. 18, 1915), and Landau, Handbuch, 103-24, 723-75.

§ 17.2. There is a large literature concerned with the zeta function and its
application to the theory of primes. See in particular the books of Ingham and
Landau, and Titchmarsh, The Riemann zeta-function (Oxford, 1951).

For the value of ζ(2n) see Bromwich, Infinite series, ed. 2, 298.

§ 17.3. The proof of Theorem 283 depends on the formulae

$0 < n^{-s}\log n - x^{-s}\log x = \int_n^x t^{-s-1}(s\log t - 1)\,dt < \frac{s}{n^2}\log(n+1),$

valid for $3 \le n \le x \le n+1$ and s > 1.

There are proofs of the theorem referred to in the footnote to p. 247 in Landau,
Handbuch, 106-7, and Titchmarsh, Theory of functions, 289-90.

§§ 17.5-10. Many of the identities in these sections, and others of similar
character, occur in Pólya and Szegő, ii. 123-32, 331-9. Some of them go back
to Euler. We do not attempt to assign them systematically to their discoverers,
but Theorems 304 and 305 were first stated by Ramanujan in the Messenger of
Math. 45 (1916), 81-84 (Collected papers, 133-5 and 185).

§ 17.6. The discussion in small print is the result of conversation with Professor
Harald Bohr.

§ 17.10. Theorem 312 is due to Jacobi, Fundamenta nova (1829), § 40 (4) and
§ 65 (6).


XVIII 
THE ORDER OF MAGNITUDE OF ARITHMETICAL FUNCTIONS 


18.1. The order of d(n). In the last chapter we discussed formal
relations satisfied by certain arithmetical functions, such as d(n), σ(n),
and φ(n). We now consider the behaviour of these functions for large
values of n, beginning with d(n). It is obvious that $d(n) \ge 2$ when
n > 1, while d(n) = 2 if n is a prime. Hence

THEOREM 313. The lower limit of d(n) as n → ∞ is 2:

$\liminf_{n \to \infty} d(n) = 2.$

It is less trivial to find any upper bound for the order of magnitude
of d(n). We first prove a negative theorem.

THEOREM 314. The order of magnitude of d(n) is sometimes larger than
that of any power of log n: the equation

(18.1.1) $d(n) = O\{(\log n)^{\Delta}\}$

is false for every Δ.†

If $n = 2^m$, then

$d(n) = m+1 \sim \frac{\log n}{\log 2}.$

If $n = (2 \cdot 3)^m$, then

$d(n) = (m+1)^2 \sim \left(\frac{\log n}{\log 6}\right)^2;$

and so on. If $l \le \Delta < l+1$
and $n = (2 \cdot 3 \ldots p_{l+1})^m$,
then

$d(n) = (m+1)^{l+1} \sim \left\{\frac{\log n}{\log(2 \cdot 3 \ldots p_{l+1})}\right\}^{l+1} > K(\log n)^{l+1},$

where K is independent of n. Hence (18.1.1) is false for an infinite
sequence of values of n.

On the other hand we can prove

THEOREM 315: $d(n) = O(n^{\delta})$

for all positive δ.

The assertions that $d(n) = O(n^{\delta})$, for all positive δ, and that
$d(n) = o(n^{\delta})$, for all positive δ, are equivalent, since $n^{\delta'} = o(n^{\delta})$ when
$0 < \delta' < \delta$.

We require the lemma

THEOREM 316. If f(n) is multiplicative, and $f(p^m) \to 0$ as $p^m \to \infty$,
then $f(n) \to 0$ as $n \to \infty$.


† The symbols O, o, ~ were defined in § 1.6.
Given any positive ε, we have
(i) $|f(p^m)| < A$ for all p and m,
(ii) $|f(p^m)| < 1$ if $p^m > B$,
(iii) $|f(p^m)| < \epsilon$ if $p^m > N(\epsilon)$,
where A and B are independent of p, m, and ε, and N(ε) depends on
ε only. If

$n = p_1^{a_1} p_2^{a_2} \ldots p_r^{a_r},$

then

$f(n) = f(p_1^{a_1}) f(p_2^{a_2}) \ldots f(p_r^{a_r}).$

Of the factors $p_1^{a_1}, p_2^{a_2}, \ldots$, not more than C are less than or equal to
B, C being independent of n and ε. The product of the corresponding
factors $f(p^a)$ is numerically less than $A^C$, and the rest of the factors
of f(n) are numerically less than 1.

The number of integers which can be formed by the multiplication of
factors $p^a \le N(\epsilon)$ is M(ε), and every such number is less than P(ε),
M(ε) and P(ε) depending only on ε. Hence, if n > P(ε), there is at
least one factor $p^a$ of n such that $p^a > N(\epsilon)$ and then, by (iii),

$|f(p^a)| < \epsilon.$

It follows that $|f(n)| < A^C \epsilon$
when n > P(ε), and therefore that $f(n) \to 0$.


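To deduce Theorem 315, we take $f(n) = n^{-\delta} d(n)$. Then f(n) is
multiplicative, by Theorem 273, and

$f(p^m) = \frac{m+1}{p^{m\delta}} \le \frac{2m}{p^{m\delta}} = \frac{2}{p^{m\delta}}\,\frac{\log p^m}{\log p} \le \frac{2}{\log 2}\,\frac{\log p^m}{(p^m)^{\delta}} \to 0$

when $p^m \to \infty$. Hence $f(n) \to 0$ when $n \to \infty$, and this is Theorem 315
(with o for O).
We can also prove Theorem 315 directly. By Theorem 273,

(18.1.2) $\frac{d(n)}{n^{\delta}} = \prod_{i=1}^{r} \frac{a_i+1}{p_i^{a_i\delta}}.$

Since $a\delta\log 2 \le e^{a\delta\log 2} = 2^{a\delta} \le p^{a\delta}$,
we have

$\frac{a+1}{p^{a\delta}} \le 1 + \frac{a}{p^{a\delta}} \le 1 + \frac{1}{\delta\log 2} \le \exp\left(\frac{1}{\delta\log 2}\right).$

We use this in (18.1.2) for those p which are less than $2^{1/\delta}$; there are less
than $2^{1/\delta}$ such primes. If $p \ge 2^{1/\delta}$, we have

$p^{\delta} \ge 2, \quad \frac{a+1}{p^{a\delta}} \le \frac{a+1}{2^a} \le 1.$

Hence

(18.1.3) $\frac{d(n)}{n^{\delta}} \le \prod_{p < 2^{1/\delta}} \exp\left(\frac{1}{\delta\log 2}\right) < \exp\left(\frac{2^{1/\delta}}{\delta\log 2}\right) = O(1).$

This is Theorem 315.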
We can use this type of argument to improve on Theorem 315. We
suppose ε > 0 and replace δ in the last paragraph by

$\alpha = \frac{(1+\frac12\epsilon)\log 2}{\log\log n}.$

Nothing is changed until we reach the final step in (18.1.3) since it is
here that, for the first time, we use the fact that δ is independent of n.
This time we have

$\frac{2^{1/\alpha}}{\alpha\log 2} = \frac{(\log n)^{1/(1+\frac12\epsilon)}\log\log n}{(1+\frac12\epsilon)\log^2 2} \le \frac{\epsilon\log 2\,\log n}{2\log\log n}$

for all $n > n_0(\epsilon)$ (by the remark at the top of p. 9). Hence

$\log d(n) \le \alpha\log n + \frac{2^{1/\alpha}}{\alpha\log 2} \le \frac{(1+\frac12\epsilon)\log 2\,\log n}{\log\log n} + \frac{\epsilon\log 2\,\log n}{2\log\log n} = \frac{(1+\epsilon)\log 2\,\log n}{\log\log n}.$

We have thus proved part of

THEOREM 317: $\limsup_{n \to \infty} \frac{\log d(n)\,\log\log n}{\log n} = \log 2;$

that is, if ε > 0, then

$d(n) < 2^{(1+\epsilon)\log n/\log\log n}$

for all $n > n_0(\epsilon)$ and

(18.1.4) $d(n) > 2^{(1-\epsilon)\log n/\log\log n}$

for an infinity of values of n.


Thus the true ‘maximum order’ of d(n) is about

$2^{\log n/\log\log n}.$

It follows from Theorem 315 that

$\frac{\log d(n)}{\log n} \to 0$

and so

$d(n) = n^{\log d(n)/\log n} = n^{\epsilon_n},$

where $\epsilon_n \to 0$ as $n \to \infty$. On the other hand, since

$2^{\log n/\log\log n} = n^{\log 2/\log\log n}$

and log log n tends very slowly to infinity, $\epsilon_n$ tends very slowly to 0.
To put it roughly, d(n) is, for some n, much more like a power of n
than a power of log n. But this happens only very rarely† and, as
Theorem 313 shows, d(n) is sometimes quite small.

To complete the proof of Theorem 317, we have to prove (18.1.4) for
a suitable sequence of n. We take n to be the product of the first r
primes, so that

$n = 2 \cdot 3 \cdot 5 \cdot 7 \ldots P, \quad d(n) = 2^r = 2^{\pi(P)},$

where P is the rth prime. It is reasonable to expect that such a choice
of n will give us a large value of d(n). The function

$\vartheta(x) = \sum_{p \le x} \log p$

is discussed in Ch. XXII, where we shall prove (Theorem 414) that

$\vartheta(x) > Ax$

for some fixed positive A and all $x \ge 2$.‡ We have then

$AP < \vartheta(P) = \sum_{p \le P} \log p = \log n,$

$\pi(P)\log P = \log P \sum_{p \le P} 1 \ge \vartheta(P) = \log n,$

and so

$\log d(n) = \pi(P)\log 2 \ge \frac{\log n\,\log 2}{\log P} > \frac{\log n\,\log 2}{\log\log n - \log A} > \frac{(1-\epsilon)\log n\,\log 2}{\log\log n}$

for $n > n_0(\epsilon)$.


18.2. The average order of d(n). If f(n) is an arithmetical
function and g(n) is any simple function of n such that

(18.2.1) $f(1) + f(2) + \ldots + f(n) \sim g(1) + \ldots + g(n),$

we say that f(n) is of the average order of g(n). For many arithmetical
functions, the sum of the left-hand side of (18.2.1) behaves much more
regularly for large n than does f(n) itself. For d(n), in particular, this
is true and we can prove very precise results about it.

THEOREM 318: $d(1) + d(2) + \ldots + d(n) \sim n\log n.$

Since

$\log 1 + \log 2 + \ldots + \log n \sim \int_1^n \log t\,dt \sim n\log n,$

the result of Theorem 318 is equivalent to

$d(1) + d(2) + \ldots + d(n) \sim \log 1 + \log 2 + \ldots + \log n.$


† See § 22.13.

‡ In fact, we prove (Theorems 6 and 420) that $\vartheta(x) \sim x$, but it is of interest that the
much simpler Theorem 414 suffices here.
We may express this by saying
THEOREM 319. The average order of d(n) is log n.
Both theorems are included in a more precise theorem, viz.
THEOREM 320:
$d(1) + d(2) + \ldots + d(n) = n\log n + (2\gamma - 1)n + O(\sqrt{n}),$
where γ is Euler’s constant.†


We prove these theorems by use of the lattice L of Ch. III, whose
vertices are the points in the (x, y)-plane with integral coordinates.
We denote by D the region in the upper right-hand quadrant contained
between the axes and the rectangular hyperbola xy = n. We count the
lattice points in D, including those on the hyperbola but not those on
the axes. Every lattice point in D appears on a hyperbola

$xy = s \quad (1 \le s \le n);$

and the number on such a hyperbola is d(s). Hence the number of
lattice points in D is $d(1) + d(2) + \ldots + d(n)$.

Of these points, n = [n] have the x-coordinate 1, $[\frac12 n]$ have the
x-coordinate 2, and so on. Hence their number is

$[n] + \left[\frac{n}{2}\right] + \left[\frac{n}{3}\right] + \ldots + \left[\frac{n}{n}\right] = n\left(1 + \frac12 + \ldots + \frac1n\right) + O(n) = n\log n + O(n),$

since the error involved in the removal of any square bracket is less
than 1. This result includes Theorem 318.
Theorem 320 requires a refinement of the method. We write

$u = [\sqrt{n}],$

so that $u^2 = n + O(\sqrt{n}) = n + O(u)$
and $\log u = \log\{\sqrt{n} + O(1)\} = \tfrac12\log n + O\!\left(\frac{1}{\sqrt{n}}\right).$

In Fig. 9 the curve GEFH is the rectangular hyperbola xy = n,
and the coordinates of A, B, C, D are (0, 0), (0, u), (u, u), (u, 0). Since
$(u+1)^2 > n$, there is no lattice point inside the small triangle ECF;
and the figure is symmetrical as between x and y. Hence the number
of lattice points in D is equal to twice the number in the strip between
AY and DF, counting those on DF and the curve but not those on
AY, less the number in the square ADCB, counting those on BC and
CD but not those on AB and AD; and therefore

$\sum_{l=1}^{n} d(l) = 2\left(\left[\frac{n}{1}\right] + \left[\frac{n}{2}\right] + \ldots + \left[\frac{n}{u}\right]\right) - u^2 = 2n\left(1 + \frac12 + \ldots + \frac1u\right) - n + O(u).$

Now

$2\left(1 + \frac12 + \ldots + \frac1u\right) = 2\log u + 2\gamma + O\!\left(\frac1u\right),$

so that

$\sum_{l=1}^{n} d(l) = 2n\log u + (2\gamma - 1)n + O(u) + O\!\left(\frac{n}{u}\right) = n\log n + (2\gamma - 1)n + O(\sqrt{n}).$

† In Theorem 422 we prove that

$1 + \frac12 + \ldots + \frac1n - \log n = \gamma + O\!\left(\frac1n\right),$

where γ is a constant, known as Euler’s constant.
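The two ways of counting the lattice points of D translate directly into two ways of computing d(1)+d(2)+...+d(n). The Python sketch below is an illustration (the function names and the numerical value used for Euler's constant are ours): the first function sums [n/x] over all x up to n, the second uses the symmetric count with $u = [\sqrt{n}]$, and the result is compared with the main terms of Theorem 320.

```python
import math

EULER_GAMMA = 0.5772156649015329   # numerical value of Euler's constant

def divisor_sum_columns(n):
    """d(1) + ... + d(n), counting lattice points of D column by column."""
    return sum(n // x for x in range(1, n + 1))

def divisor_sum_hyperbola(n):
    """The same sum by the symmetric count: 2 * sum_{x <= u} [n/x] - u^2, u = [sqrt n]."""
    u = math.isqrt(n)
    return 2 * sum(n // x for x in range(1, u + 1)) - u * u

def main_terms(n):
    """n log n + (2 gamma - 1) n, the main terms of Theorem 320."""
    return n * math.log(n) + (2 * EULER_GAMMA - 1) * n

if __name__ == "__main__":
    for n in (10 ** 2, 10 ** 4, 10 ** 6):
        exact = divisor_sum_hyperbola(n)
        print(n, exact, exact - main_terms(n), math.sqrt(n))
```

The second function needs only about $\sqrt{n}$ steps, which is the computational counterpart of the improved error term.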

Although

$\frac{1}{n}\sum_{l=1}^{n} d(l) \sim \log n,$

it is not true that ‘most’ numbers n have about log n divisors. Actually
‘almost all’ numbers have about

$(\log n)^{\log 2} = (\log n)^{0.6\ldots}$

divisors. The average log n is produced by the contributions of the
small proportion of numbers with abnormally large d(n).†

This may be seen in another way, if we assume some theorems of
Ramanujan. The sum $d^2(1) + \ldots + d^2(n)$
is of order $n(\log n)^{2^2-1} = n(\log n)^3$;
$d^3(1) + \ldots + d^3(n)$
is of order $n(\log n)^{2^3-1} = n(\log n)^7$; and so on. We should expect these
sums to be of order $n(\log n)^2$, $n(\log n)^3$,..., if d(n) were generally of the
order of log n. But, as the power of d(n) becomes larger, the numbers
with an abnormally large number of divisors dominate the average
more and more.

† ‘Almost all’ is used in the sense of § 1.6. The theorem is proved in § 22.13.


18.3. The order of σ(n). The irregularities in the behaviour of σ(n)
are much less pronounced than those of d(n).

Since 1 | n and n | n, we have first

THEOREM 321: $\sigma(n) > n$ if n > 1.

On the other hand,

THEOREM 322: $\sigma(n) = O(n^{1+\delta})$ for every positive δ.

More precisely,

THEOREM 323: $\limsup_{n \to \infty} \frac{\sigma(n)}{n\log\log n} = e^{\gamma}.$

We shall prove Theorem 322 in the next section, but must postpone
the proof of Theorem 323, which, with Theorem 321, shows that the
order of σ(n) is always ‘very nearly n’, to § 22.9.


As regards the average order, we have 

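THEOREM 324. The average order of $\sigma(n)$ is $\tfrac{1}{6}\pi^2 n$. More precisely,
$$\sigma(1) + \sigma(2) + \ldots + \sigma(n) = \tfrac{1}{12}\pi^2 n^2 + O(n\log n).$$

For $$\sigma(1) + \ldots + \sigma(n) = \sum y,$$
where the summation extends over all the lattice points in the region
D of § 18.2. Hence
$$\sum_{l=1}^{n} \sigma(l) = \sum_{x=1}^{n}\ \sum_{y \le n/x} y = \frac{1}{2}\sum_{x=1}^{n}\left[\frac{n}{x}\right]\left(\left[\frac{n}{x}\right] + 1\right) = \frac{1}{2}\sum_{x=1}^{n}\left\{\frac{n}{x} + O(1)\right\}^2 = \frac{1}{2}n^2\sum_{x=1}^{n}\frac{1}{x^2} + O\left(n\sum_{x=1}^{n}\frac{1}{x}\right) + O(n).$$
Now $$\sum_{x=1}^{n}\frac{1}{x^2} = \sum_{x=1}^{\infty}\frac{1}{x^2} + O\left(\frac{1}{n}\right) = \frac{1}{6}\pi^2 + O\left(\frac{1}{n}\right),$$
by (17.2.2), and $$\sum_{x=1}^{n}\frac{1}{x} = O(\log n).$$
Hence $$\sum_{l=1}^{n}\sigma(l) = \tfrac{1}{12}\pi^2 n^2 + O(n\log n).$$

In particular, the average order of $\sigma(n)$ is $\tfrac{1}{6}\pi^2 n$.†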
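The lattice-point sum in this proof is easy to evaluate by machine. The following Python sketch (function names are ours, for illustration only) computes $\sigma(1) + \ldots + \sigma(n)$ as $\tfrac12\sum [n/x]([n/x]+1)$ and compares it with $\tfrac{1}{12}\pi^2 n^2$.

```python
import math

def sigma(n):
    # sum of the divisors of n, by trial division (small n only)
    return sum(d for d in range(1, n + 1) if n % d == 0)

def sigma_sum(n):
    # sum over lattice points (x, y) with xy <= n of y,
    # i.e. (1/2) * sum_{x=1}^{n} [n/x] ([n/x] + 1)
    return sum((n // x) * (n // x + 1) // 2 for x in range(1, n + 1))

if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        main = math.pi ** 2 * n * n / 12
        print(n, sigma_sum(n), round((sigma_sum(n) - main) / (n * math.log(n)), 4))
```

The printed ratio of the error to $n\log n$ should stay bounded, in agreement with the theorem.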


† Since $\sum_{m=1}^{n} m \sim \tfrac{1}{2}n^2$.

18.4. The order of $\phi(n)$. The function $\phi(n)$ is also comparatively
regular, and its order is also always ‘nearly n’. In the first place

THEOREM 325: $\phi(n) < n$ if $n > 1$.

Next, if $n = p^m$, and $p > 1/\epsilon$, then
$$\phi(n) = n\left(1 - \frac{1}{p}\right) > n(1 - \epsilon).$$
Hence

THEOREM 326: $$\overline{\lim}\ \frac{\phi(n)}{n} = 1.$$


There are also two theorems for $\phi(n)$ corresponding to Theorems 322
and 323.

THEOREM 327: $$\frac{\phi(n)}{n^{1-\delta}} \to \infty$$ for every positive $\delta$.

THEOREM 328: $$\underline{\lim}\ \frac{\phi(n)\log\log n}{n} = e^{-\gamma}.$$

Theorem 327 is equivalent to Theorem 322, in virtue of

THEOREM 329: $$A < \frac{\sigma(n)\phi(n)}{n^2} < 1$$
(for a positive constant A).


To prove the last theorem we observe that, if $n = \prod p^a$, then
$$\sigma(n) = \prod_{p \mid n}\frac{p^{a+1} - 1}{p - 1} = n\prod_{p \mid n}\frac{1 - p^{-a-1}}{1 - p^{-1}}$$
and $$\phi(n) = n\prod_{p \mid n}(1 - p^{-1}).$$
Hence $$\frac{\sigma(n)\phi(n)}{n^2} = \prod_{p \mid n}(1 - p^{-a-1}),$$
which lies between 1 and $\prod(1 - p^{-2})$.† It follows that $\sigma(n)/n$ and
$n/\phi(n)$ have the same order of magnitude, so that Theorem 327 is
equivalent to Theorem 322.

To prove Theorem 327 (and so Theorem 322) we write
$$f(n) = \frac{n^{1-\delta}}{\phi(n)}.$$
Then f(n) is multiplicative, and so, by Theorem 316, it is sufficient to
prove that $$f(p^m) \to 0$$
when $p^m \to \infty$. But
$$\frac{1}{f(p^m)} = \frac{\phi(p^m)}{p^{m(1-\delta)}} = p^{m\delta}\left(1 - \frac{1}{p}\right) \ge \tfrac{1}{2}p^{m\delta} \to \infty.$$
We defer the proof of Theorem 328 to Ch. XXII.

† By Theorem 280 and (17.2.2), we see that the A of Theorem 329 is in fact
$\{\zeta(2)\}^{-1} = 6\pi^{-2}$.


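18.5. The average order of $\phi(n)$. The average order of $\phi(n)$ is
$6n/\pi^2$. More precisely

THEOREM 330:
$$\Phi(n) = \phi(1) + \ldots + \phi(n) = \frac{3n^2}{\pi^2} + O(n\log n).$$

For, by (16.3.1),
$$\Phi(n) = \sum_{m=1}^{n} m\sum_{d \mid m}\frac{\mu(d)}{d} = \sum_{dd' \le n} d'\mu(d) = \sum_{d=1}^{n}\mu(d)\sum_{d'=1}^{[n/d]} d'$$
$$= \frac{1}{2}\sum_{d=1}^{n}\mu(d)\left(\left[\frac{n}{d}\right]^2 + \left[\frac{n}{d}\right]\right) = \frac{1}{2}\sum_{d=1}^{n}\mu(d)\left\{\frac{n^2}{d^2} + O\left(\frac{n}{d}\right)\right\}$$
$$= \frac{1}{2}n^2\sum_{d=1}^{n}\frac{\mu(d)}{d^2} + O\left(n\sum_{d=1}^{n}\frac{1}{d}\right)$$
$$= \frac{1}{2}n^2\sum_{d=1}^{\infty}\frac{\mu(d)}{d^2} + O\left(n^2\sum_{d=n+1}^{\infty}\frac{1}{d^2}\right) + O(n\log n)$$
$$= \frac{n^2}{2\zeta(2)} + O(n) + O(n\log n) = \frac{3n^2}{\pi^2} + O(n\log n),$$
by Theorem 287 and (17.2.2).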


The number of terms in the Farey series $\mathfrak{F}_n$ is $\Phi(n) + 1$, so that an
alternative form of Theorem 330 is

THEOREM 331. The number of terms in the Farey series of order n is
approximately $3n^2/\pi^2$.

Theorems 330 and 331 may be stated more picturesquely in the
language of probability. Suppose that n is given, and consider all pairs
of integers (p, q) for which
$$q > 0, \quad 1 \le p \le q \le n,$$
and the corresponding fractions p/q. There are
$$\psi_n = \tfrac{1}{2}n(n+1) \sim \tfrac{1}{2}n^2$$
such fractions, and $\chi_n$, the number of them which are in their lowest
terms, is $\Phi(n)$. If, as is natural, we define ‘the probability that p
and q are prime to one another’ as
$$\lim_{n \to \infty}\frac{\chi_n}{\psi_n},$$
we obtain

THEOREM 332. The probability that two integers should be prime to one
another is $6/\pi^2$.
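The ratio $\chi_n/\psi_n$ can be computed for moderate n. The sketch below is an illustration only; the sieve and the function names are our own and are not taken from the text. It tabulates $\phi(m)$ for $m \le n$ and forms $\Phi(n)/\{\tfrac12 n(n+1)\}$.

```python
import math

def totient_sieve(n):
    # phi[m] for 0 <= m <= n; phi[0] is unused
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:  # p is prime: no smaller prime has altered phi[p]
            for m in range(p, n + 1, p):
                phi[m] -= phi[m] // p  # multiply by (1 - 1/p)
    return phi

def coprime_fraction(n):
    # chi_n / psi_n for the pairs 1 <= p <= q <= n
    chi = sum(totient_sieve(n)[1:])   # Phi(n)
    psi = n * (n + 1) // 2
    return chi / psi

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, round(coprime_fraction(n), 6), round(6 / math.pi ** 2, 6))
```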


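18.6. The number of quadratfrei numbers. An allied problem
is that of finding the probability that a number should be ‘quadratfrei’,†
i.e. of determining approximately the number Q(x) of quadratfrei
numbers not exceeding x.

We can arrange all the positive integers $n \le y^2$ in sets $S_1, S_2,\ldots$, such
that $S_d$ contains just those n whose largest square factor is $d^2$. Thus
$S_1$ is the set of all quadratfrei $n \le y^2$. The number of n belonging to
$S_d$ is $$Q\left(\frac{y^2}{d^2}\right)$$
and, when $d > y$, $S_d$ is empty. Hence
$$[y^2] = \sum_{d \le y} Q\left(\frac{y^2}{d^2}\right)$$
and so, by Theorem 268,
$$Q(y^2) = \sum_{d \le y}\mu(d)\left[\frac{y^2}{d^2}\right] = \sum_{d \le y}\mu(d)\left\{\frac{y^2}{d^2} + O(1)\right\} = y^2\sum_{d \le y}\frac{\mu(d)}{d^2} + O(y)$$
$$= y^2\sum_{d=1}^{\infty}\frac{\mu(d)}{d^2} + O\left(y^2\sum_{d > y}\frac{1}{d^2}\right) + O(y) = \frac{y^2}{\zeta(2)} + O(y) = \frac{6y^2}{\pi^2} + O(y).$$
Replacing $y^2$ by x, we obtain

THEOREM 333. The probability that a number should be quadratfrei is
$6/\pi^2$: more precisely
$$Q(x) = \frac{6x}{\pi^2} + O(\sqrt{x}).$$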
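The inversion step gives a practical way of computing Q(x). The sketch below assumes the formula $Q(x) = \sum_{d \le \sqrt{x}}\mu(d)[x/d^2]$, which is the form the inversion takes here; the helper names are ours, for illustration.

```python
import math

def mobius_sieve(n):
    # mu[m] for 1 <= m <= n (index 0 is unused)
    mu = [1] * (n + 1)
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime factor
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu

def squarefree_count(x):
    # Q(x) = sum over d <= sqrt(x) of mu(d) * [x / d^2]
    y = math.isqrt(x)
    mu = mobius_sieve(y)
    return sum(mu[d] * (x // (d * d)) for d in range(1, y + 1))

if __name__ == "__main__":
    for x in (10**2, 10**4, 10**6):
        print(x, squarefree_count(x), round(6 * x / math.pi ** 2, 1))
```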


† Without square factors, a product of different primes: see § 17.8.

A number n is quadratfrei if $\mu(n) = \pm 1$, or $|\mu(n)| = 1$. Hence an
alternative statement of Theorem 333 is

THEOREM 334: $$\sum_{n=1}^{x}|\mu(n)| = \frac{6x}{\pi^2} + O(\sqrt{x}).$$

It is natural to ask whether, among the quadratfrei numbers, those
for which $\mu(n) = 1$ and those for which $\mu(n) = -1$ occur with about
the same frequency. If they do so, then the sum
$$M(x) = \sum_{n=1}^{x}\mu(n)$$
should be of lower order than x; i.e.

THEOREM 335: $M(x) = o(x)$.

This is true, but we must defer the proof until § 22.17.


18.7. The order of r(n). The function r(n) behaves in some ways
rather like d(n), as is to be expected after Theorem 278 and (16.9.2).
If $n \equiv 3 \pmod 4$, then r(n) = 0. If $n = (p_1 p_2 \ldots p_{l+1})^m$, and every p is
4k+1, then r(n) = 4d(n). In any case $r(n) \le 4d(n)$. Hence we obtain
the analogues of Theorems 313, 314, and 315, viz.

THEOREM 336: $\underline{\lim}\ r(n) = 0$.

THEOREM 337: $r(n) = O\{(\log n)^{\Delta}\}$
is false for every $\Delta$.

THEOREM 338: $r(n) = O(n^{\delta})$
for every positive $\delta$.

There is also a theorem corresponding to Theorem 317; the maximum
order of r(n) is $$2^{\frac{\log n}{2\log\log n}}.$$

A difference appears when we consider the average order.

THEOREM 339. The average order of r(n) is $\pi$; i.e.
$$\lim_{n \to \infty}\frac{r(1) + r(2) + \ldots + r(n)}{n} = \pi.$$
More precisely

(18.7.1) $$r(1) + r(2) + \ldots + r(n) = \pi n + O(\sqrt{n}).$$

We can deduce this from Theorem 278, or prove it directly. The direct
proof is simpler. Since r(m), the number of solutions of $x^2 + y^2 = m$,
is the number of lattice points of L on the circle $x^2 + y^2 = m$, the sum
(18.7.1) is one less than the number of lattice points inside or on the
circle $x^2 + y^2 = n$. If we associate with each such lattice point the lattice
square of which it is the south-west corner, we obtain an area which is
included in the circle
$$x^2 + y^2 = (\sqrt{n} + \sqrt{2})^2$$
and includes the circle
$$x^2 + y^2 = (\sqrt{n} - \sqrt{2})^2;$$
and each of these circles has an area $\pi n + O(\sqrt{n})$.
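The direct argument can be checked numerically. In the sketch below (function names are ours) the lattice points with $x^2 + y^2 \le n$ are counted column by column, and the origin is removed to give $r(1) + \ldots + r(n)$. The difference between the two circle areas, $2\pi\sqrt{2n} + 2\pi$, gives an explicit bound for the error.

```python
import math

def lattice_points_in_circle(n):
    # number of integer pairs (x, y) with x^2 + y^2 <= n
    r = math.isqrt(n)
    return sum(2 * math.isqrt(n - x * x) + 1 for x in range(-r, r + 1))

def r_sum(n):
    # r(1) + r(2) + ... + r(n): omit the origin, which belongs to m = 0
    return lattice_points_in_circle(n) - 1

if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        print(n, r_sum(n), round(math.pi * n, 1))
```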


This geometrical argument may be extended to space of any number of dimensions.
Suppose, for example, that $r_3(n)$ is the number of integral solutions of
$$x^2 + y^2 + z^2 = n$$
(solutions differing only in sign or order being again regarded as distinct). Then
we can prove

THEOREM 340: $$r_3(1) + r_3(2) + \ldots + r_3(n) = \tfrac{4}{3}\pi n^{3/2} + O(n).$$
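If we use Theorem 278, we have
$$\sum_{1 \le v \le x} r(v) = 4\sum_{1 \le v \le x}\ \sum_{d \mid v}\chi(d) = 4\sum_{1 \le uv \le x}\chi(u),$$
the sum being extended over all the lattice points of the region D of
§ 18.2. If we write this in the form
$$4\sum_{1 \le u \le x}\chi(u)\sum_{1 \le v \le x/u} 1 = 4\sum_{1 \le u \le x}\chi(u)\left[\frac{x}{u}\right],$$
we obtain

THEOREM 341:
$$\sum_{1 \le v \le x} r(v) = 4\left(\left[\frac{x}{1}\right] - \left[\frac{x}{3}\right] + \left[\frac{x}{5}\right] - \ldots\right).$$

This formula is true whether x is an integer or not. If we sum
separately over the regions ADFY and DFX of § 18.2, and calculate
the second part of the sum by summing first along the horizontal lines
of Fig. 9, we obtain
$$4\sum_{u \le \sqrt{x}}\chi(u)\left[\frac{x}{u}\right] + 4\sum_{v \le \sqrt{x}}\ \sum_{\sqrt{x} < u \le x/v}\chi(u).$$
The second sum is $O(\sqrt{x})$, since $\sum\chi(u)$, between any limits, is 0 or $\pm 1$,
and
$$\sum_{u \le \sqrt{x}}\chi(u)\left[\frac{x}{u}\right] = \sum_{u \le \sqrt{x}}\chi(u)\frac{x}{u} + O(\sqrt{x})$$
$$= x\left\{1 - \frac{1}{3} + \frac{1}{5} - \ldots + O\left(\frac{1}{\sqrt{x}}\right)\right\} + O(\sqrt{x})$$
$$= x\left\{\tfrac{1}{4}\pi + O\left(\frac{1}{\sqrt{x}}\right)\right\} + O(\sqrt{x}) = \tfrac{1}{4}\pi x + O(\sqrt{x}).$$
This gives the result of Theorem 339.

NOTES ON CHAPTER XVIII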


§ 18.1. For the proof of Theorem 314 see Pólya and Szegő, ii. 160-1, 386.

Theorem 317 is due to Wigert, Arkiv för matematik, 3, no. 18 (1907), 1-9
(Landau, Handbuch, 219-22). Wigert’s proof depends upon the ‘prime number
theorem’ (Theorem 6), but Ramanujan (Collected papers, 85-86) showed that it
is possible to prove it in a more elementary way. Our proof is essentially
Wigert’s, modified so as not to require Theorem 6.

§ 18.2. Theorem 320 was proved by Dirichlet, Abhandl. Akad. Berlin (1849),
69-83 (Werke, ii. 49-66).

A great deal of work has been done since on the very difficult problem
(‘Dirichlet’s divisor problem’) of finding better bounds for the error in the
approximation. Suppose that $\theta$ is the lower bound of numbers $\beta$ such that
$$d(1) + d(2) + \ldots + d(n) = n\log n + (2\gamma - 1)n + O(n^{\beta}).$$
Theorem 320 shows that $\theta \le \tfrac{1}{2}$. Voronoi proved in 1903 that $\theta \le \tfrac{1}{3}$, and van der
Corput in 1922 that $\theta < \tfrac{33}{100}$, and these numbers have been improved further by
later writers. On the other hand, Hardy and Landau proved independently in
1915 that $\theta \ge \tfrac{1}{4}$. The true value of $\theta$ is still unknown. See also the note on
§ 18.7.

As regards the sums $d^2(1) + \ldots + d^2(n)$, etc., see Ramanujan, Collected papers,
133-5, and B. M. Wilson, Proc. London Math. Soc. (2) 21 (1922), 235-55.

§ 18.3. Theorem 323 is due to Gronwall, Trans. American Math. Soc. 14 (1913),
113-22.

Theorem 324 stands as stated here in Bachmann, Analytische Zahlentheorie,
402. The substance of it is contained in the memoir of Dirichlet referred to
under § 18.2.

§§ 18.4-5. Theorem 328 was proved by Landau, Archiv d. Math. u. Phys. (3)
5 (1903), 86-91 (Handbuch, 216-19); and Theorem 330 by Mertens, Journal für
Math. 77 (1874), 289-338 (Landau, Handbuch, 578-9).

§ 18.6. Theorem 333 is due to Gegenbauer, Denkschriften Akad. Wien, 49, Abt.
1 (1885), 37-80 (Landau, Handbuch, 580-2).

Landau [Handbuch, ii. 588-90] showed that Theorem 335 follows simply from
the ‘prime number theorem’ (Theorem 6) and later [Sitzungsberichte Akad. Wien,
120, Abt. 2 (1911), 973-88] that Theorem 6 follows readily from Theorem 335.

§ 18.7. For Theorem 339 see Gauss, Werke, ii. 272-5. 

This theorem, like Theorem 320, has been the starting-point of a great deal
of modern work, the aim being the determination of the number $\theta$ corresponding
to the $\theta$ of the note on § 18.2. The problem is very similar to the divisor problem,
and the numbers 1/2, 1/3, 1/4 occur in the same kind of way; but the analysis required
is in some ways a little simpler and has been pushed a little farther. See Landau,
Vorlesungen, ii. 183-308, and Titchmarsh, Proc. London Math. Soc. (2) 38 (1935),
96-115 and 555.

For a general elementary method of calculating the ‘average order’ of arith- 
metical functions belonging to a wide class and for further references to the 
literature, see Atkinson and Cherwell, Quarterly Journal of Math. (Oxford), 20 
(1949), 65-79. 


XIX 
PARTITIONS 


19.1. The general problem of additive arithmetic. In this and 
the next two chapters we shall be occupied with the additive theory of 
numbers. The general problem of the theory may be stated as follows. 

Suppose that A or $$a_1, a_2, a_3, \ldots$$
is a given system of integers. Thus A might contain all the positive
integers, or the squares, or the primes. We consider all possible representations
of an arbitrary positive integer n in the form
$$n = a_{i_1} + a_{i_2} + \ldots + a_{i_s},$$
where s may be fixed or unrestricted, the a may or may not be neces- 
sarily different, and order may or may not be relevant, according to 
the particular problem considered. We denote by r(n) the number of 
such representations. Then what can we say about r(n) ? For example, 
is r(n) always positive? Is there always at any rate one representation 
of every n ? 


19.2. Partitions of numbers. We take first the case in which A 
is the set 1, 2, 3,... of all positive integers, s is unrestricted, repetitions 
are allowed, and order is irrelevant. This is the problem of ‘unrestricted 
partitions’.

A partition of a number n is a representation of n as the sum of any
number of positive integral parts. Thus
$$5 = 4+1 = 3+2 = 3+1+1 = 2+2+1 = 2+1+1+1 = 1+1+1+1+1$$
has 7 partitions.† The order of the parts is irrelevant, so that we may,
when we please, suppose the parts to be arranged in descending order
of magnitude. We denote by p(n) the number of partitions of n; thus
p(5) = 7.
We can represent a partition graphically by an array of dots or
‘nodes’ such as

    A    . . . . . . .
         . . . .
         . . .
         . . .
         .

the dots in a row corresponding to a part. Thus A represents the
partition 7+4+3+3+1
of 18.

We might also read A by columns, in which case it would represent
the partition 5+4+4+2+1+1+1
of 18. Partitions related in this manner are said to be conjugate.

† We have, of course, to count the representation by one part only.

A number of theorems about partitions follow immediately from this 
graphical representation. A graph with m rows, read horizontally, 
represents a partition into m parts; read vertically, it represents a 
partition into parts the largest of which is m. Hence 

THEOREM 342. The number of partitions of n into m parts is equal to
the number of partitions of n into parts the largest of which is m.

Similarly,

THEOREM 343. The number of partitions of n into at most m parts is
equal to the number of partitions of n into parts which do not exceed m.
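Reading a graph by columns is easily mechanized. The following Python sketch (the names `conjugate` and `partitions` are our own) forms the conjugate of a partition given in descending order, and lets us confirm Theorem 342 by enumeration for small n.

```python
def conjugate(parts):
    # parts: a partition in descending order; column j of the graph
    # contains one node for every part exceeding j
    if not parts:
        return []
    return [sum(1 for part in parts if part > column)
            for column in range(parts[0])]

def partitions(n, max_part=None):
    # generate every partition of n as a descending list
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

if __name__ == "__main__":
    print(conjugate([7, 4, 3, 3, 1]))   # the graph A read by columns
    n, m = 12, 4
    all_parts = list(partitions(n))
    print(sum(1 for p in all_parts if len(p) == m),
          sum(1 for p in all_parts if p[0] == m))
```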

We shall make further use of ‘graphical’ arguments of this character, 


but usually we shall need the more powerful weapons provided by the 
theory of generating functions. 


19.3. The generating function of p(n). The generating functions
which are useful here are power series†
$$F(x) = \sum f(n)x^n.$$
The sum of the series whose general coefficient is f(n) is called the
generating function of f(n), and is said to enumerate f(n).

The generating function of p(n) was found by Euler, and is

(19.3.1) $$F(x) = \frac{1}{(1-x)(1-x^2)(1-x^3)\ldots} = 1 + \sum_{n=1}^{\infty} p(n)x^n.$$

We can see this by writing the infinite product as
$$(1 + x + x^2 + \ldots)$$
$$(1 + x^2 + x^4 + \ldots)$$
$$(1 + x^3 + x^6 + \ldots)$$
$$\ldots$$
and multiplying the series together. Every partition of n contributes
just 1 to the coefficient of $x^n$. Thus the partition
$$10 = 3+2+2+2+1$$
corresponds to the product of $x^3$ in the third row, $x^6 = x^{2+2+2}$ in the
second, and x in the first; and this product contributes a unit to the
coefficient of $x^{10}$.

† Compare § 17.10.
This makes (19.3.1) intuitive, but (since we have to multiply an in- 
finity of infinite series) some development of the argument is necessary. 
Suppose that 0 < x < 1, so that the product which defines F(x) is
convergent. The series
$$1 + x + x^2 + \ldots, \quad 1 + x^2 + x^4 + \ldots, \quad \ldots, \quad 1 + x^m + x^{2m} + \ldots$$
are absolutely convergent, and we can multiply them together and
arrange the result as we please. The coefficient of $x^n$ in the product is
$$p_m(n),$$
the number of partitions of n into parts not exceeding m. Hence

(19.3.2) $$F_m(x) = \frac{1}{(1-x)(1-x^2)\ldots(1-x^m)} = 1 + \sum_{n=1}^{\infty} p_m(n)x^n.$$

It is plain that

(19.3.3) $$p_m(n) \le p(n),$$

that

(19.3.4) $$p_m(n) = p(n)$$

for $n \le m$, and that

(19.3.5) $$p_m(n) \to p(n),$$

when $m \to \infty$, for every n. And

(19.3.6) $$F_m(x) = 1 + \sum_{n=1}^{m} p(n)x^n + \sum_{n=m+1}^{\infty} p_m(n)x^n.$$

The left-hand side is less than F(x) and tends to F(x) when $m \to \infty$.
Thus $$1 + \sum_{n=1}^{m} p(n)x^n < F_m(x) < F(x),$$
which is independent of m. Hence $\sum p(n)x^n$ is convergent, and so, after
(19.3.3), $\sum p_m(n)x^n$ converges, for any fixed x of the range 0 < x < 1,
uniformly for all values of m. Finally, it follows from (19.3.5) that
$$1 + \sum_{n=1}^{\infty} p(n)x^n = \lim_{m \to \infty}\left(1 + \sum_{n=1}^{\infty} p_m(n)x^n\right) = \lim_{m \to \infty} F_m(x) = F(x).$$

Incidentally, we have proved that

(19.3.7) $$\frac{1}{(1-x)(1-x^2)\ldots(1-x^m)}$$

enumerates the partitions of n into parts which do not exceed m or
(what is the same thing, after Theorem 343) into at most m parts.

We have written out the proof of the fundamental formula (19.3.1)
in detail. We have proved it for 0 < x < 1, and its truth for |x| < 1
follows at once from familiar theorems of analysis. In what follows
we shall pay no attention to such ‘convergence theorems’,† since the
interest of the subject-matter is essentially formal. The series and
products with which we deal are all absolutely convergent for small x
(and usually, as here, for |x| < 1). The questions of convergence,
identity, and so on, which arise are trivial, and can be settled at once
by any reader who knows the elements of the theory of functions.
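The formal multiplication of the series can be carried out on truncated power series. The sketch below is an illustration; the function name and the truncation parameter `limit` are our own. It multiplies in the factors $1/(1-x^k)$ one at a time and so produces $p_m(n)$, or p(n) when no bound m is imposed.

```python
def partition_counts(limit, max_part=None):
    # coefficients of 1 / ((1-x)(1-x^2)...(1-x^m)) up to x^limit;
    # with max_part=None every factor up to x^limit is used, giving p(n)
    m = limit if max_part is None else max_part
    coeffs = [1] + [0] * limit
    for k in range(1, m + 1):
        # multiplying by 1/(1 - x^k) = 1 + x^k + x^{2k} + ...
        for n in range(k, limit + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs

if __name__ == "__main__":
    print(partition_counts(10))        # p(0), p(1), ..., p(10)
    print(partition_counts(10, 3))     # p_3(0), ..., p_3(10)
```

Comparing the two lists illustrates (19.3.3) and (19.3.4): the second never exceeds the first, and they agree for $n \le 3$.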


19.4. Other generating functions. It is equally easy to find the
generating functions which enumerate the partitions of n into parts
restricted in various ways. Thus

(19.4.1) $$\frac{1}{(1-x)(1-x^3)(1-x^5)\ldots}$$

enumerates partitions into odd parts;

(19.4.2) $$\frac{1}{(1-x^2)(1-x^4)(1-x^6)\ldots}$$

partitions into even parts;

(19.4.3) $$(1+x)(1+x^2)(1+x^3)\ldots$$

partitions into unequal parts;

(19.4.4) $$(1+x)(1+x^3)(1+x^5)\ldots$$

partitions into parts which are both odd and unequal; and

(19.4.5) $$\frac{1}{(1-x)(1-x^4)(1-x^6)(1-x^9)\ldots},$$

where the indices are the numbers 5m+1 and 5m+4, partitions into
parts each of which is of one of these forms.

Another function which will occur later is

(19.4.6) $$\frac{x^N}{(1-x^2)(1-x^4)\ldots(1-x^{2m})}.$$

This enumerates the partitions of n-N into even parts not exceeding
2m, or of $\tfrac{1}{2}(n-N)$ into parts not exceeding m; or again, after Theorem
343, the partitions of $\tfrac{1}{2}(n-N)$ into at most m parts.

Some properties of partitions may be deduced at once from the forms
of these generating functions. Thus

(19.4.7) $$(1+x)(1+x^2)(1+x^3)\ldots = \frac{1-x^2}{1-x}\cdot\frac{1-x^4}{1-x^2}\cdot\frac{1-x^6}{1-x^3}\ldots = \frac{1}{(1-x)(1-x^3)(1-x^5)\ldots}.$$

† Except once in § 19.8, where again we are concerned with a fundamental identity,
and once in § 19.9, where the limit process involved is less obvious.


Hence 


THEOREM 344. The number of partitions of n into unequal parts is 
equal to the number of its partitions into odd parts. 

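It is interesting to prove this without the use of generating functions. Any
number l can be expressed uniquely in the binary scale, i.e. as
$$l = 2^a + 2^b + 2^c + \ldots \quad (0 \le a < b < c \ldots).†$$
Hence a partition of n into odd parts can be written as
$$n = l_1 \cdot 1 + l_2 \cdot 3 + l_3 \cdot 5 + \ldots$$
$$= (2^{a_1} + 2^{b_1} + \ldots)1 + (2^{a_2} + 2^{b_2} + \ldots)3 + (2^{a_3} + \ldots)5 + \ldots;$$
and there is a (1, 1) correspondence between this partition and the partition into
the unequal parts
$$2^{a_1},\ 2^{b_1},\ \ldots,\ 2^{a_2}\cdot 3,\ 2^{b_2}\cdot 3,\ \ldots,\ 2^{a_3}\cdot 5,\ 2^{b_3}\cdot 5,\ \ldots.$$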
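The correspondence just described can be written out as a short program. The sketch below is ours, and its names are illustrative. It sends a partition into odd parts to one into unequal parts by expanding each multiplicity in the binary scale, and inverts the map by stripping powers of 2.

```python
from collections import Counter

def odd_to_distinct(odd_parts):
    # an odd part q occurring l = 2^a + 2^b + ... times gives 2^a q, 2^b q, ...
    result = []
    for q, l in Counter(odd_parts).items():
        power = 1
        while l:
            if l & 1:
                result.append(power * q)
            l >>= 1
            power <<= 1
    return sorted(result, reverse=True)

def distinct_to_odd(distinct_parts):
    # a part 2^a q (q odd) is replaced by 2^a copies of q
    result = []
    for part in distinct_parts:
        copies = 1
        while part % 2 == 0:
            part //= 2
            copies *= 2
        result.extend([part] * copies)
    return sorted(result, reverse=True)

def partitions(n, max_part=None):
    # every partition of n as a descending list (helper for checking)
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

if __name__ == "__main__":
    print(odd_to_distinct([3, 3, 3, 1, 1]))   # 3 occurs 1+2 times, 1 occurs 2 times
    print(distinct_to_odd([6, 3, 2]))
```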


19.5. Two theorems of Euler. There are two identities due to 
Euler which give instructive illustrations of different methods of proof 
used frequently in this theory. 


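THEOREM 345:
$$(1+x)(1+x^3)(1+x^5)\ldots = 1 + \frac{x}{1-x^2} + \frac{x^4}{(1-x^2)(1-x^4)} + \frac{x^9}{(1-x^2)(1-x^4)(1-x^6)} + \ldots.$$

THEOREM 346:
$$(1+x^2)(1+x^4)(1+x^6)\ldots = 1 + \frac{x^2}{1-x^2} + \frac{x^6}{(1-x^2)(1-x^4)} + \frac{x^{12}}{(1-x^2)(1-x^4)(1-x^6)} + \ldots.$$

In Theorem 346 the indices in the numerators are 1.2, 2.3, 3.4,....

(i) We first prove these theorems by Euler’s device of the introduction
of a second parameter a.

Let
$$K(a) = K(a,x) = (1+ax)(1+ax^3)(1+ax^5)\ldots = 1 + c_1 a + c_2 a^2 + \ldots,$$
where $c_n = c_n(x)$ is independent of a. Plainly
$$K(a) = (1+ax)K(ax^2)$$
or $$1 + c_1 a + c_2 a^2 + \ldots = (1+ax)(1 + c_1 ax^2 + c_2 a^2x^4 + \ldots).$$
Hence, equating coefficients, we obtain
$$c_1 = x + c_1x^2,\quad c_2 = c_1x^3 + c_2x^4,\quad \ldots,\quad c_m = c_{m-1}x^{2m-1} + c_mx^{2m},\quad \ldots,$$
and so
$$c_m = \frac{x^{2m-1}}{1-x^{2m}}c_{m-1} = \frac{x^{1+3+\ldots+(2m-1)}}{(1-x^2)(1-x^4)\ldots(1-x^{2m})} = \frac{x^{m^2}}{(1-x^2)(1-x^4)\ldots(1-x^{2m})}.$$

It follows that

(19.5.1) $$(1+ax)(1+ax^3)(1+ax^5)\ldots = 1 + \frac{ax}{1-x^2} + \frac{a^2x^4}{(1-x^2)(1-x^4)} + \ldots,$$

and Theorems 345 and 346 are the special cases a = 1 and a = x.

† This is the arithmetic equivalent of the identity
$$(1+x)(1+x^2)(1+x^4)(1+x^8)\ldots = \frac{1}{1-x}.$$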

(ii) The theorems can also be proved by arguments independent of
the theory of infinite series. Such proofs are sometimes described as
‘combinatorial’. We select Theorem 345.

We have seen that the left-hand side of the identity enumerates
partitions into odd and unequal parts: thus
$$15 = 11+3+1 = 9+5+1 = 7+5+3$$
has 4 such partitions. Let us take, for example, the partition 11+3+1,
and represent it graphically as in B, the points on one bent line corresponding
to a part of the partition.

[Graphs B, C, D: the same array of 15 nodes, read along bent lines (B), horizontal lines (C), and vertical lines (D).]
We can also read the graph (considered as an array of points) as 

in C or D, along a series of horizontal or vertical lines. The graphs 

C and D differ only in orientation, and each of them corresponds to 
another partition of 15, viz. 6+3+3+1+1+ 1. A partition like this, 

symmetrical about the south-easterly direction, is called by Macmahon 
a self-conjugate partition, and the graphs establish a (1, 1) correspondence
between self-conjugate partitions and partitions into odd and unequal 
parts. The left-hand side of the identity enumerates odd and un- 
equal partitions, and therefore the identity will be proved if we can 
show that its right-hand side enumerates self-conjugate partitions.

Now our array of points may be read in a fourth way, viz. as in E.

[Graph E: the array read as a square of nodes with two tails.]

Here we have a square of $3^2$ points, and two ‘tails’, each representing
a partition of $\tfrac{1}{2}(15 - 3^2) = 3$ into 3 parts at most (and in this particular
case all 1’s). Generally, a self-conjugate partition of n can be read as
a square of $m^2$ points, and two tails representing partitions of
$$\tfrac{1}{2}(n - m^2)$$
into m parts at most. Given the (self-conjugate) partition, then m and
the reading of the partition are fixed; conversely, given n, and given
any square $m^2$ not exceeding n, there is a group of self-conjugate partitions
of n based upon a square of $m^2$ points.

Now $$\frac{x^{m^2}}{(1-x^2)(1-x^4)\ldots(1-x^{2m})}$$
is a special case of (19.4.6), and enumerates the number of partitions
of $\tfrac{1}{2}(n - m^2)$ into at most m parts, and each of these corresponds as we
have seen to a self-conjugate partition of n based upon a square of $m^2$
points. Hence, summing with respect to m,
$$1 + \sum_{m=1}^{\infty}\frac{x^{m^2}}{(1-x^2)(1-x^4)\ldots(1-x^{2m})}$$
enumerates all self-conjugate partitions of n, and this proves the
theorem.
Incidentally, we have proved 


THEOREM 347. The number of partitions of n into odd and unequal 
parts is equal to the number of its self-conjugate partitions. 
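Theorem 347 can be confirmed for small n by counting both sides. In the sketch below (function names are ours) the odd and unequal partitions are counted from the product $(1+x)(1+x^3)\ldots$, and the self-conjugate partitions from the squares $m^2$ and tails of the argument above.

```python
def bounded_partitions(n, m):
    # number of partitions of n into parts not exceeding m
    # (equivalently, into at most m parts)
    coeffs = [1] + [0] * n
    for k in range(1, m + 1):
        for j in range(k, n + 1):
            coeffs[j] += coeffs[j - k]
    return coeffs[n]

def count_odd_distinct(n):
    # coefficient of x^n in (1+x)(1+x^3)(1+x^5)...
    coeffs = [1] + [0] * n
    for k in range(1, n + 1, 2):
        for j in range(n, k - 1, -1):
            coeffs[j] += coeffs[j - k]
    return coeffs[n]

def count_self_conjugate(n):
    # sum over squares m^2 <= n of the partitions of (n - m^2)/2
    # into at most m parts
    total, m = 0, 0
    while m * m <= n:
        rest = n - m * m
        if rest % 2 == 0:
            total += bounded_partitions(rest // 2, m)
        m += 1
    return total

if __name__ == "__main__":
    for n in (15, 20, 30):
        print(n, count_odd_distinct(n), count_self_conjugate(n))
```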


Our argument suffices to prove the more general identity (19.5.1), 
and show its combinatorial meaning. The number of partitions of n 
into just m odd and unequal parts is equal to the number of self-con- 
jugate partitions of n based upon a square of m? points. The effect of 
putting a = 1 is to obliterate the distinction between different values 
of m. 

The reader will find it instructive to give a combinatorial proof of 
Theorem 346. It is best to begin by replacing $x^2$ by x, and to use the
decomposition 1+2+3+...+m of $\tfrac{1}{2}m(m+1)$. The square of (ii) is
replaced by an isosceles right-angled triangle.


19.6. Further algebraical identities. We can use the method (i) 
of § 19.5 to prove a large number of algebraical identities. Suppose, for 
example, that 


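$$K_j(a) = K_j(a,x) = (1+ax)(1+ax^2)\ldots(1+ax^j) = \sum_{m=0}^{j} c_m a^m.$$

Then $$(1+ax^{j+1})K_j(a) = (1+ax)K_j(ax).$$
Inserting the power series, and equating the coefficients of $a^m$, we
obtain $$c_m + c_{m-1}x^{j+1} = (c_m + c_{m-1})x^m$$
or $$(1-x^m)c_m = (x^m - x^{j+1})c_{m-1} = x^m(1 - x^{j-m+1})c_{m-1},$$
for $1 \le m \le j$. Hence

THEOREM 348:
$$(1+ax)(1+ax^2)\ldots(1+ax^j) = 1 + ax\frac{1-x^j}{1-x} + a^2x^3\frac{(1-x^j)(1-x^{j-1})}{(1-x)(1-x^2)} + \ldots$$
$$+\ a^mx^{\frac{1}{2}m(m+1)}\frac{(1-x^j)\ldots(1-x^{j-m+1})}{(1-x)\ldots(1-x^m)} + \ldots + a^jx^{\frac{1}{2}j(j+1)}.$$

If we write $x^2$ for x, 1/x for a, and make $j \to \infty$, we obtain Theorem 345.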
Similarly we can prove 


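THEOREM 349:
$$\frac{1}{(1-ax)(1-ax^2)\ldots(1-ax^j)} = 1 + ax\frac{1-x^j}{1-x} + a^2x^2\frac{(1-x^j)(1-x^{j+1})}{(1-x)(1-x^2)} + \ldots.$$

In particular, if we put a = 1, and make $j \to \infty$, we obtain

THEOREM 350:
$$\frac{1}{(1-x)(1-x^2)\ldots} = 1 + \frac{x}{1-x} + \frac{x^2}{(1-x)(1-x^2)} + \ldots.$$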


19.7. Another formula for F(x). As a further example of ‘combinatorial’
reasoning we prove another theorem of Euler, viz.

THEOREM 351:
$$\frac{1}{(1-x)(1-x^2)(1-x^3)\ldots} = 1 + \frac{x}{(1-x)^2} + \frac{x^4}{(1-x)^2(1-x^2)^2} + \frac{x^9}{(1-x)^2(1-x^2)^2(1-x^3)^2} + \ldots.$$

The graphical representation of any partition, say

[Figure: graph of a partition of 20.]

contains a square of nodes in the north-west corner. If we take the
largest such square, called the ‘Durfee square’ (here a square of 9 nodes),
then the graph consists of a square containing $i^2$ nodes and two tails;
one of these tails represents the partition of a number, say l, into not
more than i parts, the other the partition of a number, say m, into
parts not exceeding i; and
$$n = i^2 + l + m.$$
In the figure n = 20, i = 3, l = 6, m = 5.


The number of partitions of l (into at most i parts) is, after § 19.3,
the coefficient of $x^l$ in
$$\frac{1}{(1-x)(1-x^2)\ldots(1-x^i)},$$
and the number of partitions of m (into parts not exceeding i) is the
coefficient of $x^m$ in the same expansion. Hence the coefficient of $x^{n-i^2}$ in
$$\left\{\frac{1}{(1-x)(1-x^2)\ldots(1-x^i)}\right\}^2,$$
or of $x^n$ in $$\frac{x^{i^2}}{(1-x)^2(1-x^2)^2\ldots(1-x^i)^2},$$
is the number of possible pairs of tails in a partition of n in which the
Durfee square is $i^2$. And hence the total number of partitions of n is
the coefficient of $x^n$ in the expansion of
$$1 + \frac{x}{(1-x)^2} + \frac{x^4}{(1-x)^2(1-x^2)^2} + \ldots + \frac{x^{i^2}}{(1-x)^2(1-x^2)^2\ldots(1-x^i)^2} + \ldots.$$


This proves the theorem. 
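The decomposition into a Durfee square and two tails can be tested numerically. The sketch below is ours, with illustrative names. It finds the side of the Durfee square of a given partition, and recomputes p(n) by summing, over i, the number of pairs of tails.

```python
def durfee_side(parts):
    # largest i such that the partition (descending) has i parts each >= i
    i = 0
    while i < len(parts) and parts[i] >= i + 1:
        i += 1
    return i

def bounded_counts(limit, m):
    # coefficients of 1 / ((1-x)(1-x^2)...(1-x^m)) up to x^limit
    coeffs = [1] + [0] * limit
    for k in range(1, m + 1):
        for j in range(k, limit + 1):
            coeffs[j] += coeffs[j - k]
    return coeffs

def p_via_durfee(n):
    # sum over i of the coefficient of x^(n - i^2) in the squared series
    total, i = 0, 0
    while i * i <= n:
        rest = n - i * i
        c = bounded_counts(rest, i)
        total += sum(c[l] * c[rest - l] for l in range(rest + 1))
        i += 1
    return total

if __name__ == "__main__":
    print(durfee_side([6, 5, 4, 3, 1, 1]))
    print([p_via_durfee(n) for n in range(11)])
```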
There are also simple algebraical† proofs.

† We use the word ‘algebraical’ in its old-fashioned sense, in which it includes elementary
manipulation of power series or infinite products. Such proofs involve (though
sometimes only superficially) the use of limiting processes, and are, in the strict sense
of the word, ‘analytical’; but the word ‘analytical’ is usually reserved, in the theory
of numbers, for proofs which depend upon analysis of a deeper kind (usually upon the
theory of functions of a complex variable).

19.8. A theorem of Jacobi. We shall require later certain special
cases of a famous identity which belongs properly to the theory of 
elliptic functions. 

THEOREM 352. If |x| < 1, then

(19.8.1) $$\prod_{n=1}^{\infty}\{(1-x^{2n})(1+x^{2n-1}z)(1+x^{2n-1}z^{-1})\} = 1 + \sum_{n=1}^{\infty}x^{n^2}(z^n + z^{-n}) = \sum_{n=-\infty}^{\infty}x^{n^2}z^n$$

for all z except z = 0.

The two forms of the series are obviously equivalent.

Let us write $$P(x,z) = Q(x)R(x,z)R(x,z^{-1}),$$
where $$Q(x) = \prod_{n=1}^{\infty}(1-x^{2n}), \quad R(x,z) = \prod_{n=1}^{\infty}(1+x^{2n-1}z).$$

When |x| < 1 and $z \ne 0$, the infinite products
$$\prod_{n=1}^{\infty}(1+|x|^{2n}), \quad \prod_{n=1}^{\infty}(1+|x^{2n-1}z|), \quad \prod_{n=1}^{\infty}(1+|x^{2n-1}z^{-1}|)$$
are all convergent. Hence the products Q(x), R(x, z), $R(x, z^{-1})$ and the
product P(x, z) may be formally multiplied out and the resulting terms
collected and arranged in any way we please; the resulting series is
absolutely convergent and its sum is equal to P(x, z). In particular,
$$P(x,z) = \sum_{n=-\infty}^{\infty}a_n(x)z^n,$$
where $a_n(x)$ does not depend on z and

(19.8.2) $$a_{-n}(x) = a_n(x).$$

Provided $x \ne 0$, we can easily verify that
$$(1+xz)R(x,zx^2) = R(x,z), \quad R(x,z^{-1}x^{-2}) = (1+z^{-1}x^{-1})R(x,z^{-1}),$$
so that $xzP(x,zx^2) = P(x,z)$. Hence
$$\sum_{n=-\infty}^{\infty}x^{2n+1}a_n(x)z^{n+1} = \sum_{n=-\infty}^{\infty}a_n(x)z^n.$$
Since this is true for all values of z (except z = 0) we can equate the
coefficients of $z^{n+1}$ and find that $a_{n+1}(x) = x^{2n+1}a_n(x)$. Thus, for $n \ge 0$,
we have $$a_{n+1}(x) = x^{(2n+1)+(2n-1)+\ldots+1}a_0(x) = x^{(n+1)^2}a_0(x).$$
By (19.8.2) the same is true when $n+1 \le 0$ and so $a_n(x) = x^{n^2}a_0(x)$ for
all n, provided $x \ne 0$. But, when x = 0, the result is trivial. Hence

(19.8.3) $$P(x,z) = a_0(x)S(x,z),$$

where $$S(x,z) = \sum_{n=-\infty}^{\infty}x^{n^2}z^n.$$

To complete the proof of the theorem, we have to show that $a_0(x) = 1$.

If z has any fixed value other than zero and if $|x| < \tfrac{1}{2}$ (say), the
products Q(x), R(x, z), $R(x, z^{-1})$ and the series S(x, z) are all uniformly
convergent with respect to x. Hence P(x, z) and S(x, z) represent
continuous functions of x and, as $x \to 0$,
$$P(x,z) \to P(0,z) = 1, \quad S(x,z) \to S(0,z) = 1.$$
It follows from (19.8.3) that $a_0(x) \to 1$ as $x \to 0$.
Putting z = i, we have

(19.8.4) $$S(x,i) = 1 + 2\sum_{n=1}^{\infty}(-1)^nx^{4n^2} = S(x^4,-1).$$

Again
$$R(x,i)R(x,i^{-1}) = \prod_{n=1}^{\infty}\{(1+ix^{2n-1})(1-ix^{2n-1})\} = \prod_{n=1}^{\infty}(1+x^{4n-2}),$$
$$Q(x) = \prod_{n=1}^{\infty}(1-x^{2n}) = \prod_{n=1}^{\infty}\{(1-x^{4n})(1-x^{4n-2})\},$$
and so

(19.8.5) $$P(x,i) = \prod_{n=1}^{\infty}\{(1-x^{4n})(1-x^{8n-4})\} = \prod_{n=1}^{\infty}\{(1-x^{8n})(1-x^{8n-4})^2\} = P(x^4,-1).$$

Clearly $P(x^4,-1) \ne 0$, and so it follows from (19.8.3), (19.8.4), and
(19.8.5) that $a_0(x) = a_0(x^4)$. Using this repeatedly with $x^4, x^{4^2}, x^{4^3},\ldots$
replacing x, we have
$$a_0(x) = a_0(x^4) = \ldots = a_0(x^{4^k})$$
for any positive integer k. But |x| < 1 and so $x^{4^k} \to 0$ as $k \to \infty$. Hence
$$a_0(x) = \lim_{x \to 0}a_0(x) = 1.$$


This completes the proof of Theorem 352. 


19.9. Special cases of Jacobi’s identity. If we write $x^k$ for x,
$-x^l$ and $x^l$ for z, and replace n by n+1 on the left-hand side of (19.8.1),
we obtain

(19.9.1) $$\prod_{n=0}^{\infty}\{(1-x^{2kn+k-l})(1-x^{2kn+k+l})(1-x^{2kn+2k})\} = \sum_{n=-\infty}^{\infty}(-1)^nx^{kn^2+ln},$$

(19.9.2) $$\prod_{n=0}^{\infty}\{(1+x^{2kn+k-l})(1+x^{2kn+k+l})(1-x^{2kn+2k})\} = \sum_{n=-\infty}^{\infty}x^{kn^2+ln}.$$

Some special cases are particularly interesting.

(i) k = 1, l = 0 gives
$$\prod_{n=0}^{\infty}\{(1-x^{2n+1})^2(1-x^{2n+2})\} = \sum_{n=-\infty}^{\infty}(-1)^nx^{n^2},$$
$$\prod_{n=0}^{\infty}\{(1+x^{2n+1})^2(1-x^{2n+2})\} = \sum_{n=-\infty}^{\infty}x^{n^2},$$
two standard formulae from the theory of elliptic functions.
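(ii) $k = \tfrac{3}{2}$, $l = \tfrac{1}{2}$ in (19.9.1) gives
$$\prod_{n=0}^{\infty}\{(1-x^{3n+1})(1-x^{3n+2})(1-x^{3n+3})\} = \sum_{n=-\infty}^{\infty}(-1)^nx^{\frac{1}{2}n(3n+1)}$$
or

THEOREM 353:
$$(1-x)(1-x^2)(1-x^3)\ldots = \sum_{n=-\infty}^{\infty}(-1)^nx^{\frac{1}{2}n(3n+1)}.$$

This famous identity of Euler may also be written in the form

(19.9.3) $$(1-x)(1-x^2)(1-x^3)\ldots = 1 + \sum_{n=1}^{\infty}(-1)^n\{x^{\frac{1}{2}n(3n-1)} + x^{\frac{1}{2}n(3n+1)}\}$$
$$= 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + \ldots.$$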

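(iii) $k = l = \tfrac{1}{2}$ in (19.9.2) gives
$$\prod_{n=0}^{\infty}\{(1+x^n)(1-x^{2n+2})\} = \sum_{n=-\infty}^{\infty}x^{\frac{1}{2}n(n+1)},$$
which may be transformed, by use of (19.4.7), into

THEOREM 354:
$$\frac{(1-x^2)(1-x^4)(1-x^6)\ldots}{(1-x)(1-x^3)(1-x^5)\ldots} = 1 + x + x^3 + x^6 + x^{10} + \ldots.$$

Here the indices on the right are the triangular numbers.†

(iv) $k = \tfrac{5}{2}$, $l = \tfrac{3}{2}$ and $k = \tfrac{5}{2}$, $l = \tfrac{1}{2}$ in (19.9.1) give

THEOREM 355:
$$\prod_{n=0}^{\infty}\{(1-x^{5n+1})(1-x^{5n+4})(1-x^{5n+5})\} = \sum_{n=-\infty}^{\infty}(-1)^nx^{\frac{1}{2}n(5n+3)}.$$

THEOREM 356:
$$\prod_{n=0}^{\infty}\{(1-x^{5n+2})(1-x^{5n+3})(1-x^{5n+5})\} = \sum_{n=-\infty}^{\infty}(-1)^nx^{\frac{1}{2}n(5n+1)}.$$

We shall require these formulae later.

† The numbers $\tfrac{1}{2}n(n+1)$.

As a final application, we replace x by $x^{\frac{1}{2}}$ and z by $x^{\frac{1}{2}}\zeta$ in (19.8.1).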
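This gives
$$\prod_{n=1}^{\infty}\{(1-x^n)(1+x^n\zeta)(1+x^{n-1}\zeta^{-1})\} = \sum_{n=-\infty}^{\infty}x^{\frac{1}{2}n(n+1)}\zeta^n$$
or
$$(1+\zeta^{-1})\prod_{n=1}^{\infty}\{(1-x^n)(1+x^n\zeta)(1+x^n\zeta^{-1})\} = \sum_{m=0}^{\infty}(\zeta^m + \zeta^{-m-1})x^{\frac{1}{2}m(m+1)},$$
where on the right-hand side we have combined the terms which
correspond to n = m and n = -m-1. We deduce that

(19.9.4) $$\prod_{n=1}^{\infty}\{(1-x^n)(1+x^n\zeta)(1+x^n\zeta^{-1})\} = \sum_{m=0}^{\infty}\zeta^{-m}\left(\frac{1+\zeta^{2m+1}}{1+\zeta}\right)x^{\frac{1}{2}m(m+1)}$$
$$= \sum_{m=0}^{\infty}x^{\frac{1}{2}m(m+1)}\zeta^{-m}(1 - \zeta + \zeta^2 - \ldots + \zeta^{2m})$$

for all $\zeta$ except $\zeta = 0$ and $\zeta = -1$. We now suppose the value of x
fixed and that $\zeta$ lies in the closed interval $-\tfrac{3}{2} \le \zeta \le -\tfrac{1}{2}$. The infinite
product on the left and the infinite series on the right of (19.9.4) are
then uniformly convergent with respect to $\zeta$. Hence each represents
a continuous function of $\zeta$ in this interval and we may let $\zeta \to -1$.
We have then

THEOREM 357:
$$\prod_{n=1}^{\infty}(1-x^n)^3 = \sum_{m=0}^{\infty}(-1)^m(2m+1)x^{\frac{1}{2}m(m+1)}.$$

This is another famous theorem of Jacobi.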
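Theorems 353 and 357 are identities between power series, and their first coefficients can be compared directly. The sketch below is an illustration with names of our own choosing. It multiplies out the truncated product and builds the two series term by term.

```python
def euler_product(limit, power=1):
    # coefficients of {(1-x)(1-x^2)(1-x^3)...}^power up to x^limit
    coeffs = [1] + [0] * limit
    for _ in range(power):
        for k in range(1, limit + 1):
            # multiply by (1 - x^k), working downwards in place
            for n in range(limit, k - 1, -1):
                coeffs[n] -= coeffs[n - k]
    return coeffs

def pentagonal_series(limit):
    # sum over all integers n of (-1)^n x^{n(3n+1)/2}, as in Theorem 353
    coeffs = [0] * (limit + 1)
    coeffs[0] = 1
    n = 1
    while n * (3 * n - 1) // 2 <= limit:
        sign = -1 if n % 2 else 1
        coeffs[n * (3 * n - 1) // 2] += sign
        if n * (3 * n + 1) // 2 <= limit:
            coeffs[n * (3 * n + 1) // 2] += sign
        n += 1
    return coeffs

def jacobi_series(limit):
    # sum over m >= 0 of (-1)^m (2m+1) x^{m(m+1)/2}, as in Theorem 357
    coeffs = [0] * (limit + 1)
    m = 0
    while m * (m + 1) // 2 <= limit:
        coeffs[m * (m + 1) // 2] = (-1) ** m * (2 * m + 1)
        m += 1
    return coeffs

if __name__ == "__main__":
    print(euler_product(15))
    print(euler_product(15, power=3))
```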


19.10. Applications of Theorem 353. Euler’s identity (19.9.3)
has a striking combinatorial interpretation. The coefficient of $x^n$ in
$$(1-x)(1-x^2)(1-x^3)\ldots$$
is

(19.10.1) $$\sum(-1)^{\nu},$$

where the summation is extended over all partitions of n into unequal
parts, and $\nu$ is the number of parts in such a partition. Thus the partition
3+2+1 of 6 contributes $(-1)^3$ to the coefficient of $x^6$. But (19.10.1)
is E(n)-U(n), where E(n) is the number of partitions of n into an even
number of unequal parts, and U(n) that into an odd number. Hence
Theorem 353 may be restated as

THEOREM 358. E(n) = U(n) except when $n = \tfrac{1}{2}k(3k \pm 1)$, when
$$E(n) - U(n) = (-1)^k.$$

Thus 7 = 6+1 = 5+2 = 4+3 = 4+2+1,
E(7) = 3, U(7) = 2, E(7)-U(7) = 1,
and $7 = \tfrac{1}{2}\cdot 2\cdot(3\cdot 2+1)$, k = 2.

The identity may be used effectively for the calculation of p(n). For
$$(1 - x - x^2 + x^5 + x^7 - \ldots)\left\{1 + \sum_{n=1}^{\infty}p(n)x^n\right\} = \frac{1 - x - x^2 + x^5 + x^7 - \ldots}{(1-x)(1-x^2)(1-x^3)\ldots} = 1.$$
Hence, equating coefficients,

(19.10.2) $$p(n) - p(n-1) - p(n-2) + p(n-5) + \ldots + (-1)^kp\{n - \tfrac{1}{2}k(3k-1)\} + (-1)^kp\{n - \tfrac{1}{2}k(3k+1)\} + \ldots = 0.$$

The number of terms on the left is about $2\sqrt{\tfrac{2}{3}n}$ for large n.

Macmahon used (19.10.2) to calculate p(n) up to n = 200, and found
that p(200) = 3972999029388.
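The recurrence (19.10.2) is readily programmed. The following sketch (the function name is ours) solves it for p(n) in terms of earlier values, taking p(0) = 1 and p of a negative argument as 0, and reproduces Macmahon's value of p(200).

```python
def partition_numbers(limit):
    # p(0), p(1), ..., p(limit) from
    # p(n) = p(n-1) + p(n-2) - p(n-5) - p(n-7) + ...
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 else -1          # (-1)^(k+1)
            total += sign * p[n - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= n:
                total += sign * p[n - k * (3 * k + 1) // 2]
            k += 1
        p[n] = total
    return p

if __name__ == "__main__":
    p = partition_numbers(200)
    print(p[5], p[10], p[200])
```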


19.11. Elementary proof of Theorem 358. There is a very beautiful
proof of Theorem 358, due to Franklin, which uses no algebraical
machinery.

We try to establish a (1, 1) correspondence between partitions of
the two sorts considered in § 19.10. Such a correspondence naturally
cannot be exact, since an exact correspondence would prove that
E(n) = U(n) for all n.

We take a graph G representing a partition of n into any number
of unequal parts, in descending order. We call the lowest line AB

[Graphs G and H.]

(which may contain one point only) the ‘base’ $\beta$ of the graph. From
C, the extreme north-east node, we draw the longest south-westerly line
possible in the graph; this also may contain one node only. This line
CDE we call the ‘slope’ $\sigma$ of the graph. We write $\beta < \sigma$ when, as
in graph G, there are more nodes in $\sigma$ than in $\beta$, and use a similar
notation in other cases. Then there are three possibilities.

(a) $\beta < \sigma$. We move $\beta$ into a position parallel to and outside $\sigma$, as
shown in graph H. This gives a new partition into decreasing unequal
parts, and into a number of such parts whose parity is opposite to that
of the number in G. We call this operation $O$, and the converse operation
(removing $\sigma$ and placing it below $\beta$) $\Omega$. It is plain that $\Omega$ is not
possible, when $\beta < \sigma$, without violating the conditions of the graph.

(b) $\beta = \sigma$. In this case $O$ is possible (as in graph I) unless $\beta$ meets $\sigma$
(as in graph J), when it is impossible. $\Omega$ is not possible in either case.

(c) $\beta > \sigma$. In this case $O$ is always impossible. $\Omega$ is possible (as in
graph K) unless $\beta$ meets $\sigma$ and $\beta = \sigma+1$ (as in graph L). $\Omega$ is impossible
in the last case because it would lead to a partition with two equal
parts.


To sum up, there is a (1,1) correspondence between the two types
of partitions except in the cases exemplified by J and L. In the first of
these exceptional cases n is of the form
$$k+(k+1)+\dots+(2k-1) = \tfrac12(3k^2-k),$$
and in this case there is an excess of one partition into an even number
of parts, or one into an odd number, according as k is even or odd. In
the second case n is of the form
$$(k+1)+(k+2)+\dots+2k = \tfrac12(3k^2+k),$$
and the excess is the same. Hence $E(n)-U(n)$ is 0 unless $n = \tfrac12(3k^2\pm k)$,
when $E(n)-U(n) = (-1)^k$. This is Euler’s theorem.
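Franklin's correspondence can be carried out mechanically. The sketch below is our own encoding, offered as an illustration: a partition is a tuple of unequal parts in descending order, the base is the smallest part, the slope is the length of the initial run of consecutive parts, and the base meets the slope exactly when that run exhausts the partition. The names `franklin`, `distinct_partitions` and `excess` are hypothetical.

```python
def distinct_partitions(n, largest=None):
    """Yield the partitions of n into unequal parts, as descending tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in distinct_partitions(n - first, first - 1):
            yield (first,) + rest


def franklin(parts):
    """Apply O or Omega to a non-empty partition; return None in the exceptional cases."""
    m = len(parts)
    base = parts[-1]
    slope = 1
    while slope < m and parts[slope] == parts[slope - 1] - 1:
        slope += 1
    meets = slope == m  # the base is the last line of the slope
    if base <= slope:
        if meets and base == slope:
            return None  # graph J
        # Operation O: remove the base and lay it along the slope.
        rest = parts[:-1]
        return tuple(p + 1 if i < base else p for i, p in enumerate(rest))
    if meets and base == slope + 1:
        return None  # graph L
    # Operation Omega: remove the slope and place it below the base.
    return tuple(p - 1 if i < slope else p for i, p in enumerate(parts)) + (slope,)


def excess(n):
    """E(n) - U(n), counted directly."""
    return sum((-1) ** len(parts) for parts in distinct_partitions(n))


print(franklin((4, 3, 2)), franklin((4, 3)), excess(7))
```

Since the operation pairs each non-exceptional partition with one of opposite parity, only the exceptional partitions contribute to `excess(n)`.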


19.12. Congruence properties of p(n). In spite of the simplicity 
of the definition of p(n), not very much is known about its arithmetic 
properties. 

The simplest arithmetic properties known were found by Ramanujan. 
Examining Macmahon’s table of p(n), he was led first to conjecture, and 
then to prove, three striking arithmetic properties associated with the 
moduli 5, 7, and 11. No analogous results are known to modulus 2 or 3, 
although Newman has found some further results to modulus 13. 


THEOREM 359: $p(5m+4) \equiv 0 \pmod 5$.
THEOREM 360: $p(7m+5) \equiv 0 \pmod 7$.
THEOREM 361*: $p(11m+6) \equiv 0 \pmod{11}$.

We give here a proof of Theorem 359. Theorem 360 may be proved 
in the same kind of way, but Theorem 361 is more difficult. 

By Theorems 353 and 357,
$$x\{(1-x)(1-x^2)\dots\}^4 = x(1-x)(1-x^2)\dots\{(1-x)(1-x^2)\dots\}^3$$
$$= x(1-x-x^2+x^5+\dots)(1-3x+5x^3-7x^6+\dots)$$
$$= \sum_{r=-\infty}^{\infty}\sum_{s=0}^{\infty}(-1)^{r+s}(2s+1)x^k,$$
where $k = k(r,s) = 1+\tfrac12 r(3r+1)+\tfrac12 s(s+1)$.
We consider in what circumstances k is divisible by 5.

Now $2(r+1)^2+(2s+1)^2 = 8k-10r^2-5 \equiv 8k \pmod 5$.
Hence $k \equiv 0 \pmod 5$ implies
$$2(r+1)^2+(2s+1)^2 \equiv 0 \pmod 5.$$
Also $2(r+1)^2 \equiv 0$, 2, or 3, $(2s+1)^2 \equiv 0$, 1, or 4 (mod 5),
and we get 0 on addition only if $2(r+1)^2$ and $(2s+1)^2$ are each divisible
by 5. Hence k can be divisible by 5 only if $2s+1$ is divisible by 5, and
thus the coefficient of $x^{5m+5}$ in
$$x\{(1-x)(1-x^2)\dots\}^4$$
is divisible by 5.

Next, in the binomial expansion of $(1-x)^{-5}$, all the coefficients are
divisible by 5, except those of 1, $x^5$, $x^{10}$,..., which have the remainder 1.†
We may express this by writing
$$\frac{1}{(1-x)^5} \equiv \frac{1}{1-x^5} \pmod 5;$$
the notation, which is an extension of that used for polynomials in
§ 7.2, implying that the coefficients of every power of x are congruent.
It follows that
$$\frac{1-x^5}{(1-x)^5} \equiv 1 \pmod 5$$
and
$$\frac{(1-x^5)(1-x^{10})(1-x^{15})\dots}{\{(1-x)(1-x^2)(1-x^3)\dots\}^5} \equiv 1 \pmod 5.$$
Hence the coefficient of $x^{5m+5}$ in
$$x\,\frac{(1-x^5)(1-x^{10})\dots}{(1-x)(1-x^2)\dots} = x\{(1-x)(1-x^2)\dots\}^4\,\frac{(1-x^5)(1-x^{10})\dots}{\{(1-x)(1-x^2)\dots\}^5}$$
is a multiple of 5. Finally, since
$$\frac{x}{(1-x)(1-x^2)\dots} = x\,\frac{(1-x^5)(1-x^{10})\dots}{(1-x)(1-x^2)\dots}\,(1+x^5+x^{10}+\dots)(1+x^{10}+x^{20}+\dots)\dots,$$
the coefficient of $x^{5m+5}$ in
$$\frac{x}{(1-x)(1-x^2)\dots} = x+\sum_{n=2}^{\infty}p(n-1)x^n$$
is a multiple of 5; and this is Theorem 359.

† Theorem 76 of Ch. VI.
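The two facts on which the proof turns can be checked on truncated power series. The sketch below is only a numerical illustration, not part of the proof; the helper name `euler_product_power` and the truncation order 100 are our own choices.

```python
def euler_product_power(limit, power):
    """Coefficients of {(1-x)(1-x^2)...}^power up to x^limit (power may be negative)."""
    coeffs = [1] + [0] * limit
    for _ in range(abs(power)):
        for k in range(1, limit + 1):
            if power > 0:
                # multiply by (1 - x^k), working from high degree to low
                for n in range(limit, k - 1, -1):
                    coeffs[n] -= coeffs[n - k]
            else:
                # multiply by 1/(1 - x^k), working from low degree to high
                for n in range(k, limit + 1):
                    coeffs[n] += coeffs[n - k]
    return coeffs


fourth_power = euler_product_power(100, 4)
p = euler_product_power(100, -1)  # p[n] is the partition function p(n)

# The coefficient of x^(5m+5) in x{(1-x)(1-x^2)...}^4 is fourth_power[5m+4].
print(all(fourth_power[n] % 5 == 0 for n in range(4, 101, 5)))
print(all(p[n] % 5 == 0 for n in range(4, 101, 5)))
```

The same list `p` shows the congruences of Theorems 360 and 361 for small m, since `p[5] = 7` and `p[6] = 11`.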
The proof of Theorem 360 is similar. We use the square of Jacobi’s
series $1-3x+5x^3-7x^6+\dots$ instead of the product of Euler’s and
Jacobi’s series.


There are also congruences to moduli $5^2$, $7^2$, and $11^2$, such as
$$p(25m+24) \equiv 0 \pmod{5^2}.$$

Ramanujan made the general conjecture that if
$$\delta = 5^a 7^b 11^c,$$
and $24n \equiv 1 \pmod \delta$,
then $p(n) \equiv 0 \pmod \delta$.

It is only necessary to consider the cases $\delta = 5^a$, $7^b$, $11^c$, since all others
would follow as corollaries.

Ramanujan proved the congruences for $5^2$, $7^2$, $11^2$, Krečmar that for
$5^3$, and Watson that for general $5^a$. But Gupta, in extending Macmahon’s
table up to 300, found that
$$p(243) = 133978259344888$$
is not divisible by $7^3 = 343$; and, since $24\cdot 243 \equiv 1 \pmod{343}$, this
contradicts the conjecture for $7^3$. The conjecture for $7^b$ had therefore to
be modified, and Watson found and proved the appropriate modification,
viz. that $p(n) \equiv 0 \pmod{7^b}$ if $b > 1$ and $24n \equiv 1 \pmod{7^{2b-2}}$.
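Gupta's counter-example is easily re-examined by machine. The sketch below computes the table by adding one permitted part at a time (a standard dynamic-programming method, not the one Gupta used); the name `partition_table` is ours.

```python
def partition_table(limit):
    """Return [p(0), ..., p(limit)], admitting the parts 1, 2, 3, ... one at a time."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for n in range(part, limit + 1):
            p[n] += p[n - part]
    return p


p = partition_table(243)
print(p[243])
print((24 * 243) % 343)          # 24n is congruent to 1 (mod 7^3) for n = 243
print(p[243] % 49, p[243] % 343)  # divisible by 7^2, but not by 7^3
```

The last line illustrates Watson's modified statement with $b = 2$: the condition $24n \equiv 1 \pmod{7^2}$ holds, and $p(243)$ is divisible by $7^2$ though not by $7^3$.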

D. H. Lehmer used a quite different method based upon the analytic
theory of Hardy and Ramanujan and of Rademacher to calculate $p(n)$
for particular n. By this means he verified the truth of the conjecture
for the first values of n associated with the moduli $11^3$ and $11^4$.
Subsequently Lehner proved the conjecture for $11^3$. Dr. Atkin informs me
that he has now proved the conjecture for general $11^c$, but his proof has
not yet been published.


Dyson conjectured and Atkin and Swinnerton-Dyer proved certain
remarkable results from which Theorems 359 and 360, but not 361, are
immediate corollaries. Thus, let us define the rank of a partition as the
largest part minus the number of parts, so that, for example, the rank
of a partition and that of the conjugate partition differ only in sign.

Next we arrange the partitions of a number in five classes, each class
containing the partitions whose rank has the same residue (mod 5).
Then, if $n \equiv 4 \pmod 5$, the number of partitions in each of the five
classes is the same and Theorem 359 is an immediate corollary. There
is a similar result leading to Theorem 360.
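For small n the classes can be counted directly. The sketch below enumerates all partitions, so it is practical only for modest n; the function names are ours.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as tuples of parts in descending order."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def rank_counts(n, modulus):
    """Number of partitions of n whose rank falls in each residue class."""
    counts = [0] * modulus
    for parts in partitions(n):
        rank = parts[0] - len(parts)  # largest part minus number of parts
        counts[rank % modulus] += 1
    return counts


print(rank_counts(9, 5))   # n = 9 is of the form 5m+4
print(rank_counts(12, 7))  # n = 12 is of the form 7m+5
print(rank_counts(6, 11))  # n = 6 is of the form 11m+6: the classes are unequal
```

The last line shows why the rank does not account for Theorem 361: the eleven partitions of 6 do not fall one into each class.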


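19.13. The Rogers-Ramanujan identities. We end this chapter
with two theorems which resemble Theorems 345 and 346 superficially,
but are much more difficult to prove. These are

THEOREM 362:
$$1+\frac{x}{1-x}+\frac{x^4}{(1-x)(1-x^2)}+\frac{x^9}{(1-x)(1-x^2)(1-x^3)}+\dots = \frac{1}{(1-x)(1-x^6)\dots(1-x^4)(1-x^9)\dots},$$
i.e.
$$(19.13.1)\qquad 1+\sum_{m=1}^{\infty}\frac{x^{m^2}}{(1-x)(1-x^2)\dots(1-x^m)} = \prod_{m=0}^{\infty}\frac{1}{(1-x^{5m+1})(1-x^{5m+4})}.$$

THEOREM 363:
$$1+\frac{x^2}{1-x}+\frac{x^6}{(1-x)(1-x^2)}+\frac{x^{12}}{(1-x)(1-x^2)(1-x^3)}+\dots = \frac{1}{(1-x^2)(1-x^7)\dots(1-x^3)(1-x^8)\dots},$$
i.e.
$$(19.13.2)\qquad 1+\sum_{m=1}^{\infty}\frac{x^{m(m+1)}}{(1-x)(1-x^2)\dots(1-x^m)} = \prod_{m=0}^{\infty}\frac{1}{(1-x^{5m+2})(1-x^{5m+3})}.$$

The series here differ from those in Theorems 345 and 346 only in that
$x^2$ is replaced by x in the denominators. The peculiar interest of the
formulae lies in the unexpected part played by the number 5.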

We observe first that the theorems have, like Theorems 345 and 346,
a combinatorial interpretation. Consider Theorem 362, for example.
We can exhibit any square $m^2$ as
$$m^2 = 1+3+5+\dots+(2m-1)$$
or as shown by the black dots in the graph M, in which m = 4. If we
now take any partition of $n-m^2$ into m parts at most, with the parts
in descending order, and add it to the graph, as shown by the circles
of M, where m = 4 and $n = 4^2+11 = 27$, we obtain a partition of n
(here 27 = 11+8+6+2) into parts without repetitions or sequences,
or parts whose minimal difference is 2. The left-hand side of (19.13.1)
enumerates this type of partition of n.

[Figure: graph M]

On the other hand, the right-hand side enumerates partitions into
numbers of the forms 5m+1 and 5m+4. Hence Theorem 362 may be
restated as a purely ‘combinatorial’ theorem, viz.

THEOREM 364. The number of partitions of n with minimal difference
2 is equal to the number of partitions into parts of the forms 5m+1 and
5m+4.

Thus, when n = 9, there are 5 partitions of each type,
$$9,\quad 8+1,\quad 7+2,\quad 6+3,\quad 5+3+1$$
of the first kind, and
$$9,\quad 6+1+1+1,\quad 4+4+1,\quad 4+1+1+1+1+1,\quad 1+1+1+1+1+1+1+1+1$$
of the second.

Similarly, the combinatorial equivalent of Theorem 363 is

THEOREM 365. The number of partitions of n into parts not less than 2,
and with minimal difference 2, is equal to the number of partitions of n
into parts of the forms 5m+2 and 5m+3.
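Theorems 364 and 365 are easy to test for small n by counting both kinds of partition independently. The following sketch does so; the function names and the range of n are our own choices.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_difference_two(n, smallest=1):
    """Partitions of n with parts >= smallest and minimal difference 2."""
    if n == 0:
        return 1
    # Choose the smallest part; every other part must exceed it by at least 2.
    return sum(min_difference_two(n - part, part + 2)
               for part in range(smallest, n + 1))


def parts_in_residues(n, residues, modulus=5):
    """Partitions of n into parts lying in the given residue classes."""
    counts = [1] + [0] * n
    for part in range(1, n + 1):
        if part % modulus in residues:
            for total in range(part, n + 1):
                counts[total] += counts[total - part]
    return counts[n]


print(min_difference_two(9, 1), parts_in_residues(9, (1, 4)))
print(min_difference_two(9, 2), parts_in_residues(9, (2, 3)))
```

For n = 9 the first line reproduces the two lists of five partitions given above.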

We can prove this equivalence in the same way, starting from the
identity $m(m+1) = 2+4+6+\dots+2m$.

The proof which we give of these theorems in the next section was 
found independently by Rogers and Ramanujan. We state it in the 
form given by Rogers. It is fairly straightforward, but unilluminating, 
since it depends on writing down an auxiliary function whose genesis 
remains obscure. It is natural to ask for an elementary proof on some 
such lines as those of § 19.11, and such a proof was found by Schur;
but Schur’s proof is too elaborate for insertion here. There are other
proofs by Rogers and Schur, and one by Watson based on different
ideas. No proof is really easy (and it would perhaps be unreasonable
to expect an easy proof).


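19.14. Proof of Theorems 362 and 363. We write
$$P_0 = 1,\quad P_r = \prod_{s=1}^{r}\frac{1}{1-x^s},\quad Q_r = Q_r(a) = \prod_{s=r}^{\infty}\frac{1}{1-ax^s},\quad \lambda(r) = \tfrac12 r(5r+1),$$
and define the operator $\eta$ by
$$\eta f(a) = f(ax).$$
We introduce the auxiliary function
$$(19.14.1)\qquad H_m = H_m(a) = \sum_{r=0}^{\infty}(-1)^r a^{2r}x^{\lambda(r)-mr}(1-a^m x^{2mr})P_r Q_r,$$
where m = 0, 1, or 2. Our object is to expand $H_1$ and $H_2$ in powers of a.
We prove first that
$$(19.14.2)\qquad H_m-H_{m-1} = a^{m-1}\eta H_{3-m}\quad (m = 1, 2).$$
We have
$$H_m-H_{m-1} = \sum_{r=0}^{\infty}(-1)^r a^{2r}x^{\lambda(r)}C_{mr}P_r Q_r,$$
where
$$C_{mr} = x^{-mr}-a^m x^{mr}-x^{(1-m)r}+a^{m-1}x^{r(m-1)} = a^{m-1}x^{r(m-1)}(1-ax^r)+x^{-mr}(1-x^r).$$
Now $(1-ax^r)Q_r = Q_{r+1}$, $(1-x^r)P_r = P_{r-1}$, $1-x^0 = 0$
and so
$$H_m-H_{m-1} = \sum_{r=0}^{\infty}(-1)^r a^{2r+m-1}x^{\lambda(r)+r(m-1)}P_r Q_{r+1}+\sum_{r=1}^{\infty}(-1)^r a^{2r}x^{\lambda(r)-mr}P_{r-1}Q_r.$$
In the second sum on the right-hand side of this identity we change r
into r+1. Thus
$$H_m-H_{m-1} = \sum_{r=0}^{\infty}(-1)^r D_{mr}P_r Q_{r+1},$$
where
$$D_{mr} = a^{2r+m-1}x^{\lambda(r)+r(m-1)}-a^{2(r+1)}x^{\lambda(r+1)-m(r+1)}$$
$$= a^{m-1+2r}x^{\lambda(r)+r(m-1)}(1-a^{3-m}x^{(2r+1)(3-m)})$$
$$= a^{m-1}\eta\{a^{2r}x^{\lambda(r)-r(3-m)}(1-a^{3-m}x^{2r(3-m)})\},$$
since $\lambda(r+1)-\lambda(r) = 5r+3$. Also $Q_{r+1} = \eta Q_r$ and so
$$H_m-H_{m-1} = a^{m-1}\eta\sum_{r=0}^{\infty}(-1)^r a^{2r}x^{\lambda(r)-r(3-m)}(1-a^{3-m}x^{2r(3-m)})P_r Q_r = a^{m-1}\eta H_{3-m},$$
which is (19.14.2).

If we put m = 1 and m = 2 in (19.14.2) and remember that $H_0 = 0$,
we have
$$(19.14.3)\qquad H_1 = \eta H_2,\qquad H_2-H_1 = a\eta H_1,$$
so that
$$(19.14.4)\qquad H_2 = \eta H_2+a\eta^2 H_2.$$

We use this to expand $H_2$ in powers of a. If
$$H_2 = c_0+c_1 a+\dots = \sum c_s a^s,$$
where the $c_s$ are independent of a, then $c_0 = 1$ and (19.14.4) gives
$$\sum c_s a^s = \sum c_s x^s a^s+\sum c_s x^{2s}a^{s+1}.$$
Hence, equating the coefficients of $a^s$, we have
$$c_1 = \frac{1}{1-x},\qquad c_s = \frac{x^{2s-2}}{1-x^s}c_{s-1} = \frac{x^{2+4+\dots+2(s-1)}}{(1-x)(1-x^2)\dots(1-x^s)} = x^{s(s-1)}P_s.$$
Hence
$$H_2(a) = \sum_{s=0}^{\infty}a^s x^{s(s-1)}P_s.$$

If we put a = x, the right-hand side of this is the series in (19.13.1).
Also $P_r Q_r(x) = P_\infty$ and so, by (19.14.1),
$$H_2(x) = P_\infty\sum_{r=0}^{\infty}(-1)^r x^{\lambda(r)}(1-x^{2(2r+1)})$$
$$= P_\infty\Bigl\{1+\sum_{r=1}^{\infty}(-1)^r x^{\lambda(r)}+\sum_{r=1}^{\infty}(-1)^r x^{\lambda(r-1)+2(2r-1)}\Bigr\}$$
$$= P_\infty\Bigl\{1+\sum_{r=1}^{\infty}(-1)^r(x^{\frac12 r(5r+1)}+x^{\frac12 r(5r-1)})\Bigr\}.$$
Hence, by Theorem 356,
$$H_2(x) = P_\infty\prod_{n=0}^{\infty}\{(1-x^{5n+2})(1-x^{5n+3})(1-x^{5n+5})\} = \prod_{n=0}^{\infty}\frac{1}{(1-x^{5n+1})(1-x^{5n+4})}.$$
This completes the proof of Theorem 362.

Again, by (19.14.3),
$$H_1(a) = \eta H_2(a) = H_2(ax) = \sum_{s=0}^{\infty}a^s x^{s^2}P_s,$$
and, for a = x, the right-hand side becomes the series in (19.13.2).
Using (19.14.1) and Theorem 355, we complete the proof of Theorem
363 in the same way as we did that of Theorem 362.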
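The last steps of the proof of Theorem 362 compare three power series: the expansion of the auxiliary function with m = 2 in powers of a, taken at a = x; its value from (19.14.1) at a = x; and the infinite product. The sketch below checks, to a fixed order, that the three agree. It is a numerical check under our own reading of the notation ($P_s$ the reciprocal of $(1-x)\dots(1-x^s)$, $\lambda(r) = \tfrac12 r(5r+1)$), and the truncation order N = 40 and all function names are arbitrary choices of ours.

```python
N = 40  # all series are kept modulo x^(N+1)


def times_inverse_factor(series, k):
    """Multiply a truncated series by 1/(1 - x^k)."""
    out = list(series)
    for n in range(k, N + 1):
        out[n] += out[n - k]
    return out


def multiply(f, g):
    """Product of two truncated series."""
    out = [0] * (N + 1)
    for i, fi in enumerate(f):
        if fi:
            for j in range(N + 1 - i):
                out[i + j] += fi * g[j]
    return out


def p_series(r):
    """P_r = 1/((1-x)(1-x^2)...(1-x^r)), truncated; r = N stands in for P_infinity."""
    series = [1] + [0] * N
    for k in range(1, r + 1):
        series = times_inverse_factor(series, k)
    return series


def lam(r):
    return r * (5 * r + 1) // 2


def from_expansion():
    """sum_s x^(s^2) P_s, the expansion in powers of a taken at a = x."""
    out = [0] * (N + 1)
    s = 0
    while s * s <= N:
        ps = p_series(s)
        for n in range(s * s, N + 1):
            out[n] += ps[n - s * s]
        s += 1
    return out


def from_definition():
    """P_infinity * sum_r (-1)^r x^lam(r) (1 - x^(2(2r+1))), from (19.14.1)."""
    theta = [0] * (N + 1)
    r = 0
    while lam(r) <= N:
        sign = -1 if r % 2 else 1
        theta[lam(r)] += sign
        if lam(r) + 4 * r + 2 <= N:
            theta[lam(r) + 4 * r + 2] -= sign
        r += 1
    return multiply(p_series(N), theta)


def product_side():
    """prod 1/((1 - x^(5n+1))(1 - x^(5n+4))), truncated."""
    series = [1] + [0] * N
    for k in range(1, N + 1):
        if k % 5 in (1, 4):
            series = times_inverse_factor(series, k)
    return series


print(from_expansion() == from_definition() == product_side())
```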


19.15. Ramanujan’s continued fraction. We can write (19.14.4)
in the form
$$H_2(a,x) = H_2(ax,x)+aH_2(ax^2,x)$$
so that
$$H_2(ax,x) = H_2(ax^2,x)+axH_2(ax^3,x).$$
Hence, if we define $F(a)$ by
$$F(a) = F(a,x) = H_1(a,x) = \eta H_2(a,x) = H_2(ax,x) = 1+\frac{ax}{1-x}+\frac{a^2x^4}{(1-x)(1-x^2)}+\dots,$$
then $F(a)$ satisfies
$$F(ax^n) = F(ax^{n+1})+ax^{n+1}F(ax^{n+2}).$$
Hence, if
$$u_n = \frac{F(ax^n)}{F(ax^{n+1})},$$
we have
$$u_n = 1+\frac{ax^{n+1}}{u_{n+1}}$$
and hence $u_0 = F(a)/F(ax)$ may be developed formally as
$$(19.15.1)\qquad \frac{F(a)}{F(ax)} = 1+\cfrac{ax}{1+\cfrac{ax^2}{1+\cfrac{ax^3}{1+\dots}}},$$
a ‘continued fraction’ of a different type from those which we
considered in Ch. X.

We have no space to construct a theory of such fractions here. It is
not difficult to show that, when $|x| < 1$,
$$1+\cfrac{ax}{1+\cfrac{ax^2}{1+\dots\cfrac{ax^n}{1}}}$$
tends to a limit by means of which we can define the right-hand side
of (19.15.1). If we take this for granted, we have, in particular,
$$\frac{F(1)}{F(x)} = 1+\cfrac{x}{1+\cfrac{x^2}{1+\cfrac{x^3}{1+\dots}}}$$
and so
$$1+\cfrac{x}{1+\cfrac{x^2}{1+\dots}} = \frac{1-x^2-x^3+x^9+\dots}{1-x-x^4+x^7+\dots} = \frac{(1-x^2)(1-x^7)\dots(1-x^3)(1-x^8)\dots}{(1-x)(1-x^6)\dots(1-x^4)(1-x^9)\dots}.$$

It is known from the theory of elliptic functions that these products
and series can be calculated for certain special values of x, and in
particular when
$$x = e^{-2\pi\sqrt h}$$
and h is rational. In this way Ramanujan proved that, for example,
$$\cfrac{1}{1+\cfrac{e^{-2\pi}}{1+\cfrac{e^{-4\pi}}{1+\dots}}} = \Bigl\{\sqrt{\frac{5+\sqrt5}{2}}-\frac{\sqrt5+1}{2}\Bigr\}e^{\frac25\pi}.$$
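Both the product formula and Ramanujan's evaluation can be checked numerically in floating-point arithmetic. The sketch below evaluates the finite continued fraction from the bottom upwards; the depth, the number of product factors and the function names are assumptions of ours, chosen so that the neglected terms are far below rounding error for the values of x used.

```python
import math


def continued_fraction(x, a=1.0, depth=60):
    """1 + a x/(1 + a x^2/(1 + ... a x^depth/1)), evaluated from the bottom up."""
    value = 1.0
    for n in range(depth, 0, -1):
        value = 1.0 + a * x ** n / value
    return value


def product_ratio(x, factors=200):
    """prod (1-x^(5n+2))(1-x^(5n+3)) / ((1-x^(5n+1))(1-x^(5n+4))), truncated."""
    value = 1.0
    for n in range(factors):
        value *= (1 - x ** (5 * n + 2)) * (1 - x ** (5 * n + 3))
        value /= (1 - x ** (5 * n + 1)) * (1 - x ** (5 * n + 4))
    return value


x = 0.3
print(continued_fraction(x), product_ratio(x))

x = math.exp(-2 * math.pi)
closed_form = (math.sqrt((5 + math.sqrt(5)) / 2) - (math.sqrt(5) + 1) / 2) \
    * math.exp(2 * math.pi / 5)
print(1 / continued_fraction(x), closed_form)
```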


NOTES ON CHAPTER XIX 


§ 19.1. There are general accounts of the theory of partitions in Bachmann,
Niedere Zahlentheorie, ii, ch. 3; Netto, Combinatorik (second ed. by Brun and
Skolem, 1927); Macmahon, Combinatory analysis, ii.

§§ 19.3-7. Almost all of the formulae of these sections are Euler’s. For references 
see Dickson, History, ii, ch. 3. 

§ 19.8. Jacobi, Fundamenta nova, § 64. The theorem was known to Gauss.
The proof given here is ascribed to Jacobi by Enneper; Mr. R. F. Whitehead drew
our attention to it.

§ 19.9. Theorem 353 is due to Euler; for references see Bachmann, Niedere
Zahlentheorie, ii. 163, or Dickson, History, ii. 103. Theorem 354 was proved by
Gauss in 1808 (Werke, ii. 20), and Theorem 357 by Jacobi (Fundamenta nova, § 66).
Professor D. H. Lehmer suggested the proof of Theorem 357 given here.

§ 19.10. Macmahon’s table is printed in Proc. London Math. Soc. (2) 17 (1918),
114-15, and has subsequently been extended to 600 (Gupta, ibid. 39 (1935), 142-9,
and 42 (1937), 546-9), and to 1000 (Gupta, Gwyther, and Miller, Roy. Soc. Math.
Tables 4 (Cambridge, 1958)).

§ 19.11. F. Franklin, Comptes rendus, 92 (1881), 448-50. 

§ 19.12. See Ramanujan, Collected Papers, nos. 25, 28, 30. These papers
contain complete proofs of the congruences to moduli 5, 7, and 11 only. On p. 213
he states identities which involve the congruences to moduli $5^2$ and $7^2$ as
corollaries, and these identities were proved later by Darling, Proc. London Math.
Soc. (2) 19 (1921), 350-72, and Mordell, ibid. 20 (1922), 408-16. A manuscript
still unpublished contains alternative proofs of these congruences and one of the
congruence to modulus $11^2$. See also Newman, Can. Journ. Math. 10 (1958), 577-86.

The papers referred to at the end of the section are Gupta’s mentioned under
§ 19.10; Krečmar, Bulletin de l’acad. des sciences de l’URSS (7) 6 (1933), 763-800;
Lehmer, Journal London Math. Soc. 11 (1936), 114-18, and Bull. Amer. Math.
Soc. 44 (1938), 84-90; Watson, Journal für Math. 179 (1938), 97-128; Lehner,
Proc. Amer. Math. Soc. 1 (1950), 172-81; Dyson, Eureka 8 (1944), 10-15; Atkin
and Swinnerton-Dyer, Proc. London Math. Soc. (3) 4 (1954), 84-106.

There has been a good deal of recent work on this and related topics. See in
particular the following papers, and the references therein: Fine, Tohoku Math.
Journ. 8 (1956), 149-64; Kolberg, Math. Scand. 10 (1962), 171-81; Lehner, Amer.
Journ. Math. 71 (1949), 373-86; Newman, Trans. Amer. Math. Soc. 97 (1960),
225-36, Illinois Journ. Math. 6 (1962), 59-63; as well as the papers of Lehner
and Newman already referred to.

I am indebted to Dr. Atkin for the references to recent work.

§§ 19.13-14. For the history of the Rogers-Ramanujan identities, first found
by Rogers in 1894, see the note by Hardy reprinted on pp. 344-5 of Ramanujan’s
Collected papers, and Hardy, Ramanujan, ch. 6. Schur’s proofs appeared in the
Berliner Sitzungsberichte (1917), 302-21, and Watson’s in the Journal London
Math. Soc. 4 (1929), 4-9. Hardy, Ramanujan, 95-99 and 107-11, gives other
variations of the proofs.


Selberg, Avhandlinger Norske Akad. (1936), no. 8, has generalized the argu- 
ment of Rogers and Ramanujan, and found similar, but less simple, formulae 
associated with the number 7. Dyson, Journal London Math. Soc. 18 (1943), 
35-39, has pointed out that these also may be found in Rogers’s work, and has 
simplified the proofs considerably. 

Mr. C. Sudler suggested a substantial improvement in the presentation of the 
proof in § 19.14. 


XX 


THE REPRESENTATION OF A NUMBER BY TWO OR 
FOUR SQUARES 


20.1. Waring’s problem: the numbers g(k) and G(k). Waring’s
problem is that of the representation of positive integers as sums of a
fixed number s of non-negative kth powers. It is the particular case of
the general problem of § 19.1 in which the a are
$$0^k,\ 1^k,\ 2^k,\ 3^k,\ \dots$$
and s is fixed. When k = 1, the problem is that of partitions into s
parts of unrestricted form; such partitions are enumerated, as we saw
in Ch. XIX, by the function
$$\frac{1}{(1-x)(1-x^2)\dots(1-x^s)}.$$
Hence we take $k \ge 2$.

It is plainly impossible to represent all integers if s is too small, for
example if s = 1. Indeed it is impossible if $s < k$. For the number of
values of $x_1$ for which $x_1^k \le n$ does not exceed $n^{1/k}+1$; and so the
number of sets of values $x_1, x_2, \dots, x_{k-1}$ for which
$$x_1^k+\dots+x_{k-1}^k \le n$$
does not exceed
$$(n^{1/k}+1)^{k-1} = n^{(k-1)/k}+O(n^{(k-2)/k}).$$
Hence most numbers are not representable by $k-1$ or fewer kth powers.

The first question that arises is whether, for a given k, there is any
fixed $s = s(k)$ such that
$$(20.1.1)\qquad n = x_1^k+x_2^k+\dots+x_s^k$$
is soluble for every n.

The answer is by no means obvious. For example, if the a of § 19.1 are the
numbers
$$1,\ 2,\ 2^2,\ \dots,\ 2^m,\ \dots,$$
then the number
$$2^{m+1}-1 = 1+2+2^2+\dots+2^m$$
is not representable by less than m+1 numbers a, and $m+1 \to \infty$ when
$n = 2^{m+1}-1 \to \infty$. Hence it is not true that all numbers are representable by
a fixed number of powers of 2.

Waring stated without proof that every number is the sum of 4
squares, of 9 cubes, of 19 biquadrates, ‘and so on’. His language implies
that he believed that the answer to our question is affirmative, that
(20.1.1) is soluble for each fixed k, any positive n, and an $s = s(k)$
depending only on k. It is very improbable that Waring had any
sufficient grounds for his assertion, and it was not until more than 100
years later that Hilbert first proved it true.

A number representable by s kth powers is plainly representable by
any larger number. Hence, if all numbers are representable by s kth
powers, there is a least value of s for which this is true. This least value 
of s is denoted by g(k). We shall prove in this chapter that g(2) = 4, 
that is to say that any number is representable by four squares and 
that four is the least number of squares by which all numbers are 
representable. In Ch. XXI we shall prove that g(3) and g(4) exist, 
but without determining their values. 

There is another number in some ways still more interesting than
g(k). Let us suppose, to fix our ideas, that k = 3. It is known that
g(3) = 9; every number is representable by 9 or fewer cubes, and every
number, except
$$23 = 2\cdot 2^3+7\cdot 1^3$$
and
$$239 = 2\cdot 4^3+4\cdot 3^3+3\cdot 1^3,$$
can be represented by 8 or fewer cubes. Thus all sufficiently large
numbers are representable by 8 or fewer. The evidence indeed indicates
that only 15 other numbers, of which the largest is 454, require so many
cubes as 8, and that 7 suffice from 455 onwards.
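The numerical evidence referred to here is easily reproduced over a limited range. The sketch below finds, for every n up to a chosen bound, the least number of positive cubes with sum n; the bound 1000 and the name `least_cubes` are our own, and a finite computation of this kind says nothing, of course, about numbers beyond the bound.

```python
def least_cubes(limit):
    """need[n] = least number of positive cubes with sum n, for 0 <= n <= limit."""
    cubes = []
    c = 1
    while c ** 3 <= limit:
        cubes.append(c ** 3)
        c += 1
    need = [0] * (limit + 1)
    for n in range(1, limit + 1):
        need[n] = 1 + min(need[n - cube] for cube in cubes if cube <= n)
    return need


need = least_cubes(1000)
print([n for n in range(1, 1001) if need[n] == 9])
print([n for n in range(1, 1001) if need[n] == 8])
```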

It is plain, if this be so, that 9 is not the number which is really most 
significant in the problem. The facts that just two numbers require 9 
cubes, and, if it is a fact, that just 15 more require 8, are, so to say,
arithmetical flukes, depending on comparatively trivial idiosyncrasies 
of special numbers. The most fundamental and most difficult problem 
is that of deciding, not how many cubes are required for the representa- 
tion of all numbers, but how many are required for the representation 
of all large numbers, i.e. of all numbers with some finite number of 
exceptions. 

We define G(k) as the least value of s for which it is true that all
sufficiently large numbers, i.e. all numbers with at most a finite number
of exceptions, are representable by s kth powers. Thus $G(3) \le 8$. On
the other hand, as we shall see in the next chapter, $G(3) \ge 4$; there are
infinitely many numbers not representable by three cubes. Thus G(3)
is 4, 5, 6, 7, or 8; it is still not known which.

It is plain that
$$G(k) \le g(k)$$
for every k. In general, G(k) is much smaller than g(k), the value of
g(k) being swollen by the difficulty of representing certain comparatively
small numbers.


20.2. Squares. In this chapter we confine ourselves to the case
k = 2. Our main theorem is Theorem 369, which, combined with the
trivial result† that no number of the form 8m+7 can be the sum of
three squares, shows that
$$g(2) = G(2) = 4.$$

We give three proofs of this fundamental theorem. The first (§ 20.5)
is elementary and depends on the ‘method of descent’, due in principle
to Fermat. The second (§§ 20.6-9) depends on the arithmetic of quaternions.
The third (§§ 20.11-12) depends on an identity which belongs
properly to the theory of elliptic functions (though we prove it by
elementary algebra),‡ and gives a formula for the number of representations.

† See § 20.10. ‡ See the footnote to p. 281.

But before we do this we return for a time to the problem of the
representation of a number by two squares.

THEOREM 366. A number n is the sum of two squares if and only if
all prime factors of n of the form 4m+3 have even exponents in the standard
form of n.

This theorem is an immediate consequence of (16.9.5) and Theorem
278. There are, however, other proofs of Theorem 366, independent of
the arithmetic of k(i), which involve interesting and important ideas.


20.3. Second proof of Theorem 366. We have to prove that n is
of the form $x^2+y^2$ if and only if
$$(20.3.1)\qquad n = n_1^2 n_2,$$
where $n_2$ has no prime factors of the form 4m+3.
We say that
$$n = x^2+y^2$$
is a primitive representation of n if (x, y) = 1, and otherwise an
imprimitive representation.

THEOREM 367. If p = 4m+3 and $p \mid n$, then n has no primitive
representations.

If n has a primitive representation, then
$$p \mid (x^2+y^2),\quad (x, y) = 1,$$
and so $p \nmid x$, $p \nmid y$. Hence, by Theorem 57, there is a number l such
that $y \equiv lx \pmod p$ and so
$$x^2(1+l^2) \equiv x^2+y^2 \equiv 0 \pmod p.$$
It follows that
$$1+l^2 \equiv 0 \pmod p$$
and therefore that -1 is a quadratic residue of p, which contradicts
Theorem 82.

THEOREM 368. If p = 4m+3, $p^c \mid n$, $p^{c+1} \nmid n$, and c is odd, then n has
no representations (primitive or imprimitive).

Suppose that $n = x^2+y^2$, (x, y) = d; and let $p^\gamma$ be the highest power
of p which divides d. Then
$$x = dX,\quad y = dY,\quad (X, Y) = 1,$$
$$n = d^2(X^2+Y^2) = d^2N,$$
say. The index of the highest power of p which divides N is $c-2\gamma$,
which is positive because c is odd. Hence
$$N = X^2+Y^2,\quad (X, Y) = 1,\quad p \mid N;$$
which contradicts Theorem 367.

It remains to prove that n is representable when n is of the form
(20.3.1), and it is plainly enough to prove $n_2$ representable. Also
$$(x_1^2+y_1^2)(x_2^2+y_2^2) = (x_1x_2+y_1y_2)^2+(x_1y_2-x_2y_1)^2,$$
so that the product of two representable numbers is itself representable.
Since $2 = 1^2+1^2$ is representable, the problem is reduced to that of
proving Theorem 251, i.e. of proving that if p = 4m+1, then p is
representable.
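The criterion of Theorem 366 can be compared with a direct search over a finite range. The sketch below does this by trial division; the function names and the bound 2000 are ours, and squares of zero are admitted, as in the text.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Direct search for n = x^2 + y^2 with x, y >= 0."""
    for x in range(isqrt(n) + 1):
        remainder = n - x * x
        y = isqrt(remainder)
        if y * y == remainder:
            return True
    return False


def satisfies_criterion(n):
    """True if every prime factor of n of the form 4m+3 has an even exponent."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            if p % 4 == 3 and exponent % 2 == 1:
                return False
        p += 1
    # What remains is 1 or a single prime, which occurs to the first power.
    return n % 4 != 3


print(all(is_sum_of_two_squares(n) == satisfies_criterion(n)
          for n in range(1, 2001)))
```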

Since -1 is a quadratic residue of such a p, there is an l for which
$$l^2 \equiv -1 \pmod p.$$
Taking $n = [\sqrt p]$ in Theorem 36, we see that there are integers a and b
such that
$$0 < b < \sqrt p,\qquad \Bigl|-\frac{l}{p}-\frac{a}{b}\Bigr| < \frac{1}{b\sqrt p}.$$
If we write
$$c = lb+pa,$$
then
$$|c| < \sqrt p,\qquad 0 < b^2+c^2 < 2p.$$
But $c \equiv lb \pmod p$, and so
$$b^2+c^2 \equiv b^2+l^2b^2 \equiv b^2(1+l^2) \equiv 0 \pmod p;$$
and therefore
$$b^2+c^2 = p.$$


20.4. Third and fourth proofs of Theorem 366. (1) Another proof
of Theorem 366, due (in principle at any rate) to Fermat, is based on
the ‘method of descent’. To prove that p = 4m+1 is representable,
we prove (i) that some multiple of p is representable, and (ii) that the
least representable multiple of p must be p itself. The rest of the proof
is the same.

By Theorem 86, there are numbers x, y such that
$$(20.4.1)\qquad x^2+y^2 = mp,\quad p \nmid x,\quad p \nmid y$$
and $0 < m < p$. Let $m_0$ be the least value of m for which (20.4.1)
is soluble, and write $m_0$ for m in (20.4.1). If $m_0 = 1$, our theorem is
proved.

If $m_0 > 1$, then $1 < m_0 < p$. Now $m_0$ cannot divide both x and y,
since this would involve
$$m_0^2 \mid (x^2+y^2) \to m_0^2 \mid m_0p \to m_0 \mid p.$$
Hence we can choose c and d so that
$$x_1 = x-cm_0,\quad y_1 = y-dm_0,$$
$$|x_1| \le \tfrac12 m_0,\quad |y_1| \le \tfrac12 m_0,\quad x_1^2+y_1^2 > 0,$$
and therefore
$$(20.4.2)\qquad 0 < x_1^2+y_1^2 \le 2(\tfrac12 m_0)^2 < m_0^2.$$
Now
$$x_1^2+y_1^2 \equiv x^2+y^2 \equiv 0 \pmod{m_0}$$
or
$$(20.4.3)\qquad x_1^2+y_1^2 = m_1m_0,$$
where $0 < m_1 < m_0$, by (20.4.2). Multiplying (20.4.3) by (20.4.1), with
$m = m_0$, we obtain
$$m_0^2m_1p = (x^2+y^2)(x_1^2+y_1^2) = (xx_1+yy_1)^2+(xy_1-x_1y)^2.$$
But
$$xx_1+yy_1 = x(x-cm_0)+y(y-dm_0) = m_0X,$$
$$xy_1-x_1y = x(y-dm_0)-y(x-cm_0) = m_0Y,$$
where $X = p-cx-dy$, $Y = cy-dx$. Hence
$$m_1p = X^2+Y^2\quad (0 < m_1 < m_0),$$
which contradicts the definition of $m_0$. It follows that $m_0$ must be 1.
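The descent is constructive: starting from any representable multiple of p, each step produces a smaller one. The sketch below follows the argument step by step for a prime p = 4m+1. It is an illustration under our own assumptions: the initial multiple is found by a brute-force search for x with $x^2+1$ divisible by p (a stand-in for Theorem 86), and the function name is hypothetical.

```python
def descent_two_squares(p):
    """Return (x, y) with x^2 + y^2 = p, for a prime p of the form 4m+1."""
    # A representable multiple of p: x^2 + 1^2 = m p with 0 < m < p.
    x = next(t for t in range(1, p) if (t * t + 1) % p == 0)
    y = 1
    m = (x * x + y * y) // p
    while m > 1:
        # Residues of x and y (mod m) of absolute value at most m/2.
        x1 = x % m
        if 2 * x1 > m:
            x1 -= m
        y1 = y % m
        if 2 * y1 > m:
            y1 -= m
        m1 = (x1 * x1 + y1 * y1) // m      # 0 < m1 < m
        big_x = (x * x1 + y * y1) // m     # both divisions are exact
        big_y = (x * y1 - x1 * y) // m
        x, y, m = big_x, big_y, m1
    return abs(x), abs(y)


print(descent_two_squares(13))
print(descent_two_squares(1009))
```

For p = 13 the search gives $5^2+1^2 = 2\cdot 13$, and a single step of the descent leads to $3^2+2^2 = 13$.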


(2) A fourth proof, due to Grace, depends on the ideas of Ch. III.
By Theorem 82, there is a number l for which
$$l^2+1 \equiv 0 \pmod p.$$
We consider the points (x, y) of the fundamental lattice $\Lambda$ which satisfy
$$y \equiv lx \pmod p.$$
These points define a lattice M.† It is easy to see that the proportion
of points of $\Lambda$, in a large circle round the origin, which belong to M is
asymptotically 1/p, and that the area of a fundamental parallelogram
of M is therefore p.

† We state the proof shortly, leaving some details to the reader.

Suppose that A or $(\xi, \eta)$ is one of the points of M nearest to the
origin. Then $\eta \equiv l\xi$ and so
$$-\xi \equiv l^2\xi \equiv l\eta \pmod p,$$
and therefore B or $(-\eta, \xi)$ is also a point of M. There is no point of
M inside the triangle OAB, and therefore none within the square with
sides OA, OB. Hence this square is a fundamental parallelogram of M,
and therefore its area is p. It follows that
$$\xi^2+\eta^2 = p.$$

20.5. The four-square theorem. We pass now to the principal 
theorem of this chapter. 

THEOREM 369 (LAGRANGE’S THEOREM). Every positive integer is the
sum of four squares.

Since
$$(20.5.1)\qquad (x_1^2+x_2^2+x_3^2+x_4^2)(y_1^2+y_2^2+y_3^2+y_4^2)$$
$$= (x_1y_1+x_2y_2+x_3y_3+x_4y_4)^2+(x_1y_2-x_2y_1+x_3y_4-x_4y_3)^2$$
$$+(x_1y_3-x_3y_1+x_4y_2-x_2y_4)^2+(x_1y_4-x_4y_1+x_2y_3-x_3y_2)^2,$$
the product of two representable numbers is itself representable. Also
$1 = 1^2+0^2+0^2+0^2$. Hence Theorem 369 will follow from

THEOREM 370. Any prime p is the sum of four squares.

Our first proof proceeds on the same lines as the proof of Theorem 366
in § 20.4 (1). Since $2 = 1^2+1^2+0^2+0^2$, we can take $p > 2$.

It follows from Theorem 87 that there is a multiple of p, say mp,
such that
$$mp = x_1^2+x_2^2+x_3^2+x_4^2,$$
with $x_1, x_2, x_3, x_4$ not all divisible by p; and we have to prove that the
least such multiple of p is p itself.

Let $m_0p$ be the least such multiple. If $m_0 = 1$, there is nothing more
to prove; we suppose therefore that $m_0 > 1$. By Theorem 87, $m_0 < p$.

If $m_0$ is even, then $x_1+x_2+x_3+x_4$ is even and so either (i) $x_1, x_2, x_3, x_4$
are all even, or (ii) they are all odd, or (iii) two are even and two are
odd. In the last case, let us suppose that $x_1, x_2$ are even and $x_3, x_4$
are odd. Then in all three cases
$$x_1+x_2,\quad x_1-x_2,\quad x_3+x_4,\quad x_3-x_4$$
are all even, and so
$$\tfrac12 m_0p = \Bigl(\frac{x_1+x_2}{2}\Bigr)^2+\Bigl(\frac{x_1-x_2}{2}\Bigr)^2+\Bigl(\frac{x_3+x_4}{2}\Bigr)^2+\Bigl(\frac{x_3-x_4}{2}\Bigr)^2$$
is the sum of four integral squares. These squares are not all divisible
by p, since $x_1, x_2, x_3, x_4$ are not all divisible by p. But this contradicts
our definition of $m_0$. Hence $m_0$ must be odd.

Next, $x_1, x_2, x_3, x_4$ are not all divisible by $m_0$, since this would imply
$$m_0^2 \mid m_0p \to m_0 \mid p,$$
which is impossible. Also $m_0$ is odd, and therefore at least 3. We can
therefore choose $b_1, b_2, b_3, b_4$ so that
$$y_i = x_i-b_im_0\quad (i = 1, 2, 3, 4)$$
satisfy
$$|y_i| < \tfrac12 m_0,\qquad y_1^2+y_2^2+y_3^2+y_4^2 > 0.$$
Then
$$0 < y_1^2+y_2^2+y_3^2+y_4^2 < 4(\tfrac12 m_0)^2 = m_0^2,$$
and
$$y_1^2+y_2^2+y_3^2+y_4^2 \equiv 0 \pmod{m_0}.$$

It follows that
$$x_1^2+x_2^2+x_3^2+x_4^2 = m_0p\quad (m_0 < p),$$
$$y_1^2+y_2^2+y_3^2+y_4^2 = m_0m_1\quad (0 < m_1 < m_0);$$
and so, by (20.5.1),
$$(20.5.2)\qquad m_0^2m_1p = z_1^2+z_2^2+z_3^2+z_4^2,$$
where $z_1, z_2, z_3, z_4$ are the four numbers which occur on the right-hand
side of (20.5.1). But
$$z_1 = \sum x_iy_i = \sum x_i(x_i-b_im_0) \equiv \sum x_i^2 \equiv 0 \pmod{m_0};$$
and similarly $z_2, z_3, z_4$ are divisible by $m_0$. We may therefore write
$$z_i = m_0t_i\quad (i = 1, 2, 3, 4);$$
and then (20.5.2) becomes
$$m_1p = t_1^2+t_2^2+t_3^2+t_4^2,$$
which contradicts the definition of $m_0$ because $m_1 < m_0$.
It follows that $m_0 = 1$.
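As with two squares, the descent gives a method of calculation. The sketch below follows the two cases of the proof for a prime p. It is an illustration under assumptions of our own: a brute-force search over $0 \le x_1, x_2 \le \tfrac12(p-1)$ with $x_1^2+x_2^2+1$ divisible by p takes the place of Theorem 87, and the function names are hypothetical.

```python
def euler_product(x, y):
    """The four numbers on the right-hand side of (20.5.1)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
            x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
            x1 * y3 - x3 * y1 + x4 * y2 - x2 * y4,
            x1 * y4 - x4 * y1 + x2 * y3 - x3 * y2)


def four_squares_prime(p):
    """Return four integers whose squares have sum p, for a prime p."""
    if p == 2:
        return (1, 1, 0, 0)
    half = (p - 1) // 2
    # A multiple m p of p, with m < p, which is a sum of four squares.
    x = next((a, b, 1, 0) for a in range(half + 1) for b in range(half + 1)
             if (a * a + b * b + 1) % p == 0)
    m = sum(v * v for v in x) // p
    while m > 1:
        if m % 2 == 0:
            # Pair numbers of like parity, then halve the multiple.
            x1, x2, x3, x4 = sorted(x, key=lambda v: v % 2)
            x = ((x1 + x2) // 2, (x1 - x2) // 2, (x3 + x4) // 2, (x3 - x4) // 2)
            m //= 2
        else:
            # Residues (mod m) of absolute value less than m/2.
            y = tuple((v + m // 2) % m - m // 2 for v in x)
            m1 = sum(v * v for v in y) // m
            x = tuple(z // m for z in euler_product(x, y))  # exact divisions
            m = m1
    return tuple(abs(v) for v in x)


print(four_squares_prime(7))
print(four_squares_prime(1009))
```

A representation of a composite number may then be built up from those of its prime factors by repeated use of `euler_product`.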


20.6. Quaternions. In Ch. XV we deduced Theorem 251 from 
the arithmetic of the Gaussian integers, a subclass of the complex 
numbers of ordinary analysis. There is a proof of Theorem 370 based 
on ideas which are similar, but more sophisticated because we use 
numbers which do not obey all the laws of ordinary algebra. 

Quaternions† are ‘hyper-complex’ numbers of a special kind. The
numbers of the system are of the form
$$(20.6.1)\qquad \alpha = a_0+a_1i_1+a_2i_2+a_3i_3,$$
where $a_0, a_1, a_2, a_3$ are real numbers (the coordinates of $\alpha$), and $i_1, i_2, i_3$
elements characteristic of the system. Two quaternions are equal if
their coordinates are equal.

† We take the elements of the algebra of quaternions for granted. A reader who
knows nothing of quaternions, but accepts what is stated here, will be able to follow
§§ 20.7-9.

These numbers are combined according to rules which resemble those
of ordinary algebra in all respects but one. There are, as in ordinary
algebra, operations of addition and multiplication. The laws of addition
are the same as in ordinary algebra; thus
$$\alpha+\beta = (a_0+a_1i_1+a_2i_2+a_3i_3)+(b_0+b_1i_1+b_2i_2+b_3i_3)$$
$$= (a_0+b_0)+(a_1+b_1)i_1+(a_2+b_2)i_2+(a_3+b_3)i_3.$$
Multiplication is associative and distributive, but not generally
commutative. It is commutative for the coordinates, and between the
coordinates and $i_1, i_2, i_3$; but
$$(20.6.2)\qquad i_1^2 = i_2^2 = i_3^2 = -1,$$
$$i_2i_3 = i_1 = -i_3i_2,\quad i_3i_1 = i_2 = -i_1i_3,\quad i_1i_2 = i_3 = -i_2i_1.$$
Generally,
$$(20.6.3)\qquad \alpha\beta = (a_0+a_1i_1+a_2i_2+a_3i_3)(b_0+b_1i_1+b_2i_2+b_3i_3) = c_0+c_1i_1+c_2i_2+c_3i_3,$$
where
$$(20.6.4)\qquad c_0 = a_0b_0-a_1b_1-a_2b_2-a_3b_3,$$
$$c_1 = a_0b_1+a_1b_0+a_2b_3-a_3b_2,$$
$$c_2 = a_0b_2-a_1b_3+a_2b_0+a_3b_1,$$
$$c_3 = a_0b_3+a_1b_2-a_2b_1+a_3b_0.$$
In particular,
$$(20.6.5)\qquad (a_0+a_1i_1+a_2i_2+a_3i_3)(a_0-a_1i_1-a_2i_2-a_3i_3) = a_0^2+a_1^2+a_2^2+a_3^2,$$
the coefficients of $i_1, i_2, i_3$ in the product being zero.

We shall say that the quaternion $\alpha$ is integral if $a_0, a_1, a_2, a_3$ are either
(i) all rational integers or (ii) all halves of odd rational integers. We
are interested only in integral quaternions; and henceforth we use
‘quaternion’ to mean ‘integral quaternion’. We shall use Greek letters
for quaternions, except that, when $a_1 = a_2 = a_3 = 0$ and so $\alpha = a_0$, we
shall use $a_0$ both for the quaternion
$$a_0+0.i_1+0.i_2+0.i_3$$
and for the rational integer $a_0$.

The quaternion
$$(20.6.6)\qquad \bar\alpha = a_0-a_1i_1-a_2i_2-a_3i_3$$
is called the conjugate of $\alpha = a_0+a_1i_1+a_2i_2+a_3i_3$, and
$$(20.6.7)\qquad N\alpha = \alpha\bar\alpha = \bar\alpha\alpha = a_0^2+a_1^2+a_2^2+a_3^2$$
the norm of $\alpha$. The norm of an integral quaternion is a rational integer.
We shall say that $\alpha$ is odd or even according as $N\alpha$ is odd or even.

It follows from (20.6.3), (20.6.4), and (20.6.6) that
$$\overline{\alpha\beta} = \bar\beta\bar\alpha,$$
and so
$$(20.6.8)\qquad N(\alpha\beta) = \alpha\beta.\overline{\alpha\beta} = \alpha\beta.\bar\beta\bar\alpha = \alpha.N\beta.\bar\alpha = \alpha\bar\alpha.N\beta = N\alpha N\beta.$$
We define $\alpha^{-1}$, when $\alpha \ne 0$, by
$$(20.6.9)\qquad \alpha^{-1} = \frac{\bar\alpha}{N\alpha},$$
so that
$$(20.6.10)\qquad \alpha\alpha^{-1} = \alpha^{-1}\alpha = 1.$$

If $\alpha$ and $\alpha^{-1}$ are both integral, then we say that $\alpha$ is a unity, and write
$\alpha = \epsilon$. Since $\epsilon\epsilon^{-1} = 1$, $N\epsilon N\epsilon^{-1} = 1$ and so $N\epsilon = 1$. Conversely, if $\alpha$ is
integral and $N\alpha = 1$, then $\alpha^{-1} = \bar\alpha$ is also integral, so that $\alpha$ is a unity.
Thus a unity may be defined alternatively as an integral quaternion
whose norm is 1.

If $a_0, a_1, a_2, a_3$ are all integral, and $a_0^2+a_1^2+a_2^2+a_3^2 = 1$, then one of
$a_0^2$,... must be 1 and the rest 0. If they are all halves of odd integers,
then each of $a_0^2$,... must be $\tfrac14$. Hence there are just 24 unities, viz.
$$(20.6.11)\qquad \pm1,\quad \pm i_1,\quad \pm i_2,\quad \pm i_3,\quad \tfrac12(\pm1\pm i_1\pm i_2\pm i_3).$$

If we write
$$(20.6.12)\qquad \rho = \tfrac12(1+i_1+i_2+i_3),$$
then any integral quaternion may be expressed in the form
$$(20.6.13)\qquad k_0\rho+k_1i_1+k_2i_2+k_3i_3,$$
where $k_0, k_1, k_2, k_3$ are rational integers; and any quaternion of this
form is integral. It is plain that the sum of any two integral quaternions
is integral. Also, after (20.6.3) and (20.6.4),
$$\rho^2 = \tfrac12(-1+i_1+i_2+i_3) = \rho-1,$$
$$\rho i_1 = \tfrac12(-1+i_1+i_2-i_3) = -\rho+i_1+i_2,$$
$$i_1\rho = \tfrac12(-1+i_1-i_2+i_3) = -\rho+i_1+i_3,$$
with similar expressions for $\rho i_2$, etc. Hence all these products are integral,
and therefore the product of any two integral quaternions is integral.

If $\epsilon$ is any unity, then $\epsilon\alpha$ and $\alpha\epsilon$ are said to be associates of $\alpha$.
Associates have equal norms, and the associates of an integral quaternion
are integral.

If $\gamma = \alpha\beta$, then $\gamma$ is said to have $\alpha$ as a left-hand divisor and $\beta$ as
a right-hand divisor. If $\alpha = a_0$ or $\beta = b_0$, then $\alpha\beta = \beta\alpha$ and the
distinction of right and left is unnecessary.
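The rules of this section are readily put into a program. In the sketch below a quaternion is represented by the tuple of its four coordinates as exact fractions, the product follows (20.6.4), and the 24 unities of (20.6.11) are listed and tested. The representation and all the names are choices of ours made for illustration.

```python
from fractions import Fraction
from itertools import product


def quaternion(a0, a1, a2, a3):
    return tuple(Fraction(c) for c in (a0, a1, a2, a3))


def multiply(a, b):
    """The product of two quaternions, by (20.6.4)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


def conjugate(a):
    return (a[0], -a[1], -a[2], -a[3])


def norm(a):
    return sum(c * c for c in a)


def is_integral(a):
    """All coordinates integers, or all halves of odd integers."""
    return (all(c.denominator == 1 for c in a)
            or all(c.denominator == 2 for c in a))


def unities():
    """The 24 unities of (20.6.11)."""
    found = []
    for position in range(4):
        for sign in (1, -1):
            coords = [0, 0, 0, 0]
            coords[position] = sign
            found.append(quaternion(*coords))
    half = Fraction(1, 2)
    for signs in product((1, -1), repeat=4):
        found.append(tuple(s * half for s in signs))
    return found


rho = quaternion(Fraction(1, 2), Fraction(1, 2), Fraction(1, 2), Fraction(1, 2))
units = unities()
print(len(units), all(norm(e) == 1 for e in units))
print(multiply(rho, rho) == tuple(c - d for c, d in zip(rho, quaternion(1, 0, 0, 0))))
```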


20.7. Preliminary theorems about integral quaternions. Our
second proof of Theorem 370 is similar in principle to that of Theorem
251 contained in §§ 12.8 and 15.1. We need some preliminary theorems.

THEOREM 371. If $\alpha$ is an integral quaternion, then one at least of its
associates has integral coordinates; and if $\alpha$ is odd, then one at least of
its associates has non-integral coordinates.

(1) If the coordinates of $\alpha$ itself are not integral, then we can choose
the signs so that
$$\alpha = (b_0+b_1i_1+b_2i_2+b_3i_3)+\tfrac12(\pm1\pm i_1\pm i_2\pm i_3) = \beta+\gamma,$$
say, where $b_0, b_1, b_2, b_3$ are even. Any associate of $\beta$ has integral
coordinates, and $\gamma\bar\gamma$, an associate of $\gamma$, is 1. Hence $\alpha\bar\gamma$, an associate of $\alpha$, has
integral coordinates.

(2) If $\alpha$ is odd, and has integral coordinates, then
$$\alpha = (b_0+b_1i_1+b_2i_2+b_3i_3)+(c_0+c_1i_1+c_2i_2+c_3i_3) = \beta+\gamma,$$
say, where $b_0, b_1, b_2, b_3$ are even, each of $c_0, c_1, c_2, c_3$ is 0 or 1, and
(since $N\alpha$ is odd) either one is 1 or three are. Any associate of $\beta$ has
integral coordinates. It is therefore sufficient to prove that each of the
quaternions
$$1,\ i_1,\ i_2,\ i_3,\ 1+i_2+i_3,\ 1+i_1+i_3,\ 1+i_1+i_2,\ i_1+i_2+i_3$$
has an associate with non-integral coordinates, and this is easily verified.
Thus, if $\gamma = i_1$, then $\gamma\rho$ has non-integral coordinates. If
$$\gamma = 1+i_2+i_3 = (1+i_1+i_2+i_3)-i_1 = \lambda+\mu$$
or
$$\gamma = i_1+i_2+i_3 = (1+i_1+i_2+i_3)-1 = \lambda+\mu,$$
then
$$\lambda\epsilon = \lambda.\tfrac12(1-i_1-i_2-i_3) = 2$$
and the coordinates of $\mu\epsilon$ are non-integral.
THEOREM 372. If κ is an integral quaternion, and m a positive integer,
then there is an integral quaternion λ such that

N(κ-mλ) < m^2.

The case m = 1 is trivial, and we may suppose m > 1. We use the
form (20.6.13) of an integral quaternion, and write

κ = k_0 ρ+k_1 i_1+k_2 i_2+k_3 i_3, λ = l_0 ρ+l_1 i_1+l_2 i_2+l_3 i_3,

where k_0,..., l_0,... are integers. The coordinates of κ-mλ are

½(k_0-ml_0), ½{k_0+2k_1-m(l_0+2l_1)}, ½{k_0+2k_2-m(l_0+2l_2)},
½{k_0+2k_3-m(l_0+2l_3)}.

We can choose l_0, l_1, l_2, l_3 in succession so that these have absolute
values not exceeding ¼m, ½m, ½m, ½m; and then

N(κ-mλ) ≤ (1/16)m^2+3.(1/4)m^2 < m^2.

THEOREM 373. If α and β are integral quaternions, and β ≠ 0, then
there are integral quaternions λ and γ such that

α = λβ+γ, Nγ < Nβ.

We take κ = α\bar{β}, m = β\bar{β} = Nβ,
and determine λ as in Theorem 372. Then

(α-λβ)\bar{β} = κ-λm = κ-mλ,
N(α-λβ)N\bar{β} = N(κ-mλ) < m^2,
Nγ = N(α-λβ) < m = Nβ.
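Theorem 373 is a division algorithm, and it may help to see one carried out. The sketch below is illustrative only and uses a different recipe from the proof above: instead of choosing l_0, ..., l_3 in succession, it rounds the exact quotient αβ^{-1} to the nearest quaternion with integer coordinates and to the nearest with half-odd coordinates, and keeps the closer one (this gives Nγ ≤ ½Nβ). The names `qmul` and `divide_right` are hypothetical helpers, and quaternions are 4-tuples of exact fractions.

```python
from fractions import Fraction as F
import math

def qmul(a, b):
    """Hamilton product of quaternions given as (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(a):
    return (a[0], -a[1], -a[2], -a[3])

def norm(a):
    return sum(x * x for x in a)

def divide_right(alpha, beta):
    """Return (lam, gamma) with alpha = lam*beta + gamma and N(gamma) < N(beta)."""
    alpha, beta = tuple(map(F, alpha)), tuple(map(F, beta))
    nb = norm(beta)
    q = tuple(x / nb for x in qmul(alpha, conj(beta)))      # exact alpha * beta^{-1}
    ints = tuple(F(round(x)) for x in q)                     # integer coordinates
    halves = tuple(F(math.floor(x)) + F(1, 2) for x in q)    # half-odd coordinates
    dist = lambda c: norm(tuple(x - y for x, y in zip(q, c)))
    lam = min(ints, halves, key=dist)                        # nearer integral quaternion
    gamma = tuple(x - y for x, y in zip(alpha, qmul(lam, beta)))
    return lam, gamma

alpha, beta = (7, 3, -2, 5), (2, 1, 0, 1)
lam, gamma = divide_right(alpha, beta)
print(lam, gamma, norm(gamma) < norm(beta))
```

Applying `divide_right` repeatedly would give the Euclidean algorithm mentioned in § 20.8, although the text prefers the argument by ideals.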


20.8. The highest common right-hand divisor of two quaternions.
We shall say that two integral quaternions α and β have a
highest common right-hand divisor δ if (i) δ is a right-hand divisor of α
and β, and (ii) every right-hand divisor of α and β is a right-hand divisor
of δ; and we shall prove that any two integral quaternions, not both 0,
have a highest common right-hand divisor which is effectively unique.
We could use Theorem 373 for the construction of a ‘Euclidean
algorithm’ similar to those of §§ 12.3 and 12.8, but it is simpler to use ideas
like those of §§ 2.9 and 15.7.

We call a system S of integral quaternions, one of which is not 0,
a right-ideal if it has the properties

(i) α ∈ S . β ∈ S → α±β ∈ S,

(ii) α ∈ S → λα ∈ S for all integral quaternions λ:

the latter property corresponds to the characteristic property of the
ideals of § 15.7. If δ is any integral quaternion, and S is the set (λδ)
of all left-hand multiples of δ by integral quaternions λ, then it is plain
that S is a right-ideal. We call such a right-ideal a principal right-ideal.

THEOREM 374. Every right-ideal is a principal right-ideal.

Among the members of S, not 0, there are some with minimum norm:
we call one of these δ. If γ ∈ S, Nγ < Nδ, then γ = 0.

If α ∈ S then α-λδ ∈ S, for every integral λ, by (i) and (ii). By
Theorem 373, we can choose λ so that Nγ = N(α-λδ) < Nδ. But then
γ = 0, α = λδ, and so S is the principal right-ideal (λδ).

We can now prove


THEOREM 375. Any two integral quaternions α and β, not both 0, have
a highest common right-hand divisor δ, which is unique except for a
left-hand unit factor, and can be expressed in the form

(20.8.1) δ = μα+νβ,

where μ and ν are integral.

The set S of all quaternions μα+νβ is plainly a right-ideal which,
by Theorem 374, is the principal right-ideal formed by all integral
multiples λδ of a certain δ. Since S includes δ, δ can be expressed in
the form (20.8.1). Since S includes α and β, δ is a common right-hand
divisor of α and β; and any such divisor is a right-hand divisor of every
member of S, and therefore of δ. Hence δ is a highest common
right-hand divisor of α and β.

Finally, if both δ and δ' satisfy the conditions, δ' = λδ and δ = λ'δ',
where λ and λ' are integral. Hence δ = λ'λδ, 1 = λ'λ, and λ and λ' are
unities.

If δ is a unity ε, then all highest common right-hand divisors of α and
β are unities. In this case

μ'α+ν'β = ε,

for some integral μ', ν'; and

(ε^{-1}μ')α+(ε^{-1}ν')β = 1;

so that

(20.8.2) μα+νβ = 1

for some integral μ, ν. We then write

(20.8.3) (α, β)_r = 1.

We could of course establish a similar theory of the highest common
left-hand divisor.

If α and β have a common right-hand divisor δ, not a unity, then
Nα and Nβ have the common right-hand divisor Nδ > 1. There is one
important case in which the converse is true.


THEOREM 376. If α is integral and β = m, a positive rational integer,
then a necessary and sufficient condition that (α, β)_r = 1 is that
(Nα, Nβ) = 1, or (what is the same thing) that (Nα, m) = 1.

For if (α, β)_r = 1 then (20.8.2) is true for appropriate μ, ν. Hence

N(μα) = N(1-νβ) = (1-mν)(1-m\bar{ν}),
NμNα = 1-mν-m\bar{ν}+m^2 Nν,

and (Nα, m) divides every term in this equation except 1. Hence
(Nα, m) = 1. Since Nβ = m^2, the two forms of the condition are
equivalent.


20.9. Prime quaternions and the proof of Theorem 370. An
integral quaternion π, not a unity, is said to be prime if its only divisors
are the unities and its associates, i.e. if π = αβ implies that either α or
β is a unity. It is plain that all associates of a prime are prime. If
π = αβ, then Nπ = NαNβ, so that π is certainly prime if Nπ is a
rational prime. We shall prove that the converse is also true.

THEOREM 377. An integral quaternion π is prime if and only if its
norm Nπ is a rational prime.

Since Np = p^2, a particular case of Theorem 377 is

THEOREM 378. A rational prime p cannot be a prime quaternion.

We begin by proving Theorem 378 (which is all that we shall actually
need).

Since 2 = (1+i_1)(1-i_1),
2 is not a prime quaternion. We may therefore suppose p odd.

By Theorem 87, there are integers r and s such that

0 < r < p, 0 < s < p, 1+r^2+s^2 ≡ 0 (mod p).

If α = 1+si_1-ri_2,
then Nα = 1+r^2+s^2 ≡ 0 (mod p),
and (Nα, p) > 1. It follows, by Theorem 376, that α and p have a
common right-hand divisor δ which is not a unity. If

α = δ_1 δ, p = δ_2 δ,

then δ_2 is not a unity; for if it were then δ would be an associate of p,
in which case p would divide all the coordinates of

α = δ_1 δ = δ_1 δ_2^{-1} p,

and in particular 1. Hence p = δ_2 δ, where neither δ nor δ_2 is a unity,
and so p is not prime.

To complete the proof of Theorem 377, suppose that π is prime and
p a rational prime divisor of Nπ. By Theorem 376, π and p have a
common right-hand divisor π' which is not a unity. Since π is prime,
π' is an associate of π and Nπ' = Nπ. Also p = λπ', where λ is
integral; and p^2 = NλNπ' = NλNπ, so that Nλ is 1 or p. If Nλ were
1, p would be an associate of π' and π, and so a prime quaternion,
which we have seen to be impossible. Hence Nπ = p, a rational prime.

It is now easy to prove Theorem 370. If p is any rational prime,
p = λπ, where Nλ = Nπ = p. If π has integral coordinates a_0, a_1, a_2,
a_3, then

p = Nπ = a_0^2+a_1^2+a_2^2+a_3^2.

If not then, by Theorem 371, there is an associate π' of π which has
integral coordinates. Since

p = Nπ = Nπ',

the conclusion follows as before.

The analysis of the preceding sections may be developed so as to
lead to a complete theory of the factorization of integral quaternions
and of the representation of rational integers by sums of four squares.
In particular it leads to formulae for the number of representations,
analogous to those of §§ 16.9-10. We shall prove these formulae by a
different method in § 20.12, and shall not pursue the arithmetic of
quaternions further here. There is however one other interesting
theorem which is an immediate consequence of our analysis. If we
suppose p odd, and select an associate π' of π whose coordinates are
halves of odd integers (as we may by Theorem 371), then

p = Nπ = Nπ' = (b_0+½)^2+(b_1+½)^2+(b_2+½)^2+(b_3+½)^2,

where b_0,... are integers, and

4p = (2b_0+1)^2+(2b_1+1)^2+(2b_2+1)^2+(2b_3+1)^2.

Hence we obtain

THEOREM 379. If p is an odd prime, then 4p is the sum of four odd
integral squares.

Thus 4.3 = 12 = 1^2+1^2+1^2+3^2 (but 4.2 = 8 is not the sum of four
odd integral squares).
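Theorem 379 is easy to test numerically for small primes. The following sketch is a brute-force search, not a method drawn from the quaternion theory; `four_odd_squares` is a hypothetical helper name.

```python
from itertools import combinations_with_replacement
from math import isqrt

def four_odd_squares(n):
    """Return odd a <= b <= c <= d with a^2+b^2+c^2+d^2 == n, or None."""
    odds = range(1, isqrt(n) + 1, 2)
    for a, b, c in combinations_with_replacement(odds, 3):
        rest = n - a*a - b*b - c*c
        if rest < c*c:          # keep d >= c and rest positive
            continue
        d = isqrt(rest)
        if d*d == rest and d % 2 == 1:
            return (a, b, c, d)
    return None

for p in (3, 5, 7, 11, 13):
    print(p, four_odd_squares(4 * p))
print(four_odd_squares(8))  # None, as remarked in the text
```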


20.10. The values of g(2) and G(2). Theorem 369 shows that

G(2) ≤ g(2) ≤ 4.

On the other hand,

(2m)^2 ≡ 0 (mod 4), (2m+1)^2 ≡ 1 (mod 8),

so that x^2 ≡ 0, 1, or 4 (mod 8)
and x^2+y^2+z^2 ≢ 7 (mod 8).

Hence no number 8m+7 is representable by three squares, and we
obtain

THEOREM 380: g(2) = G(2) = 4.

If x^2+y^2+z^2 ≡ 0 (mod 4), then all of x, y, z are even, and

¼(x^2+y^2+z^2) = (½x)^2+(½y)^2+(½z)^2

is representable by three squares. It follows that no number 4^a(8m+7)
is the sum of three squares. It can be proved that any number not of
this form is the sum of three squares, so that

n ≠ 4^a(8m+7)

is a necessary and sufficient condition for n to be representable by three
squares; but the proof depends upon the theory of ternary quadratic
forms and cannot be included here.
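Although the three-square theorem is not proved here, the stated condition can at least be checked by machine over a small range. The sketch below compares a brute-force search with the criterion n ≠ 4^a(8m+7); the function names and the range 1 ≤ n < 2000 are arbitrary choices made for the illustration.

```python
from math import isqrt

def is_sum_of_three_squares(n):
    """Brute force: is n = x^2 + y^2 + z^2 with x, y, z >= 0?"""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x*x) + 1):
            z2 = n - x*x - y*y
            if isqrt(z2) ** 2 == z2:
                return True
    return False

def excluded_form(n):
    """True when n = 4^a (8m + 7) for some a, m >= 0 (n must be positive)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7

agree = all(is_sum_of_three_squares(n) != excluded_form(n) for n in range(1, 2000))
print(agree)  # True
```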


20.11. Lemmas for the third proof of Theorem 369. Our third 
proof of Theorem 369 is of a quite different kind and, although 
‘elementary’, belongs properly to the theory of elliptic functions. 

The coefficient r_4(n) of x^n in

(1+2x+2x^4+...)^4 = (Σ_{m=-∞}^{∞} x^{m^2})^4

is the number of solutions of

n = m_1^2+m_2^2+m_3^2+m_4^2

in rational integers, solutions differing only in the sign or order of the
m being reckoned as distinct. We have to prove that this coefficient
is positive for every n.

By Theorem 312

(1+2x+2x^4+...)^2 = 1+4(x/(1-x) - x^3/(1-x^3) + x^5/(1-x^5) - ...),

and we proceed to find a transformation of the square of the right-hand
side.

In what follows x is any number, real or complex, for which |x| < 1.
The series which we use, whether simple or multiple, are absolutely
convergent for |x| < 1. The rearrangements to which we subject them
are all justified by the theorem that any absolutely convergent series,
simple or multiple, may be summed in any manner we please.

We write u_r = x^r/(1-x^r),

so that x^r/(1-x^r)^2 = u_r(1+u_r).

We require two preliminary lemmas.

THEOREM 381:

Σ_{m=1}^{∞} u_m(1+u_m) = Σ_{n=1}^{∞} n u_n.

For

Σ_{m=1}^{∞} x^m/(1-x^m)^2 = Σ_{m=1}^{∞} Σ_{n=1}^{∞} n x^{mn} = Σ_{n=1}^{∞} n Σ_{m=1}^{∞} x^{mn} = Σ_{n=1}^{∞} n x^n/(1-x^n).

THEOREM 382:

Σ_{m=1}^{∞} (-1)^{m-1} u_{2m}(1+u_{2m}) = Σ_{n=1}^{∞} (2n-1)u_{4n-2}.

For

Σ_{m=1}^{∞} (-1)^{m-1} x^{2m}/(1-x^{2m})^2 = Σ_{m=1}^{∞} (-1)^{m-1} Σ_{r=1}^{∞} r x^{2mr}
= Σ_{r=1}^{∞} r Σ_{m=1}^{∞} (-1)^{m-1} x^{2mr} = Σ_{r=1}^{∞} r x^{2r}/(1+x^{2r})
= Σ_{r=1}^{∞} {r x^{2r}/(1-x^{2r}) - 2r x^{4r}/(1-x^{4r})} = Σ_{n=1}^{∞} (2n-1)x^{4n-2}/(1-x^{4n-2}).

20.12. Third proof of Theorem 369: the number of representations.
We begin by proving an identity more general than the
actual one we need.

THEOREM 383. If θ is real and not an even multiple of π, and if

L = L(x, θ) = ¼cot ½θ+u_1 sin θ+u_2 sin 2θ+...,

T_1 = T_1(x, θ) = (¼cot ½θ)^2+u_1(1+u_1)cos θ+u_2(1+u_2)cos 2θ+...,

T_2 = T_2(x, θ) = ½{u_1(1-cos θ)+2u_2(1-cos 2θ)+3u_3(1-cos 3θ)+...},

then

L^2 = T_1+T_2.

We have

L^2 = (¼cot ½θ+Σ_{n=1}^{∞} u_n sin nθ)^2
= (¼cot ½θ)^2+½ Σ_{n=1}^{∞} u_n cot ½θ sin nθ+Σ_{m=1}^{∞} Σ_{n=1}^{∞} u_m u_n sin mθ sin nθ
= (¼cot ½θ)^2+S_1+S_2,

say. We now use the identities

½cot ½θ sin nθ = ½+cos θ+cos 2θ+...+cos(n-1)θ+½cos nθ,
2 sin mθ sin nθ = cos(m-n)θ-cos(m+n)θ,

which give

S_1 = Σ_{n=1}^{∞} u_n{½+cos θ+cos 2θ+...+cos(n-1)θ+½cos nθ},

S_2 = ½ Σ_{m=1}^{∞} Σ_{n=1}^{∞} u_m u_n{cos(m-n)θ-cos(m+n)θ},

and

L^2 = (¼cot ½θ)^2+C_0+Σ_{k=1}^{∞} C_k cos kθ,

say, on rearranging S_1 and S_2 as series of cosines of multiples of θ.†

We consider C_0 first. This coefficient includes a contribution ½ Σ_{n=1}^{∞} u_n
from S_1, and a contribution ½ Σ_{n=1}^{∞} u_n^2 from the terms of S_2 for which
m = n. Hence

C_0 = ½ Σ_{n=1}^{∞} (u_n+u_n^2) = ½ Σ_{n=1}^{∞} n u_n,

by Theorem 381.

Now suppose k > 0. Then S_1 contributes

½u_k+Σ_{n=k+1}^{∞} u_n = ½u_k+Σ_{l=1}^{∞} u_{k+l}

to C_k, while S_2 contributes

½ Σ_{m-n=k} u_m u_n+½ Σ_{n-m=k} u_m u_n-½ Σ_{m+n=k} u_m u_n,

where m ≥ 1, n ≥ 1 in each summation. Hence

C_k = ½u_k+Σ_{l=1}^{∞} u_{k+l}+Σ_{l=1}^{∞} u_l u_{k+l}-½ Σ_{l=1}^{k-1} u_l u_{k-l}.

The reader will easily verify that

u_l u_{k-l} = u_k(1+u_l+u_{k-l})

and u_{k+l}+u_l u_{k+l} = u_k(u_l-u_{k+l}).

Hence

C_k = u_k{½+Σ_{l=1}^{∞} (u_l-u_{k+l})-½ Σ_{l=1}^{k-1} (1+u_l+u_{k-l})}
= u_k{½+u_1+u_2+...+u_k-½(k-1)-(u_1+u_2+...+u_{k-1})}
= u_k(1+u_k-½k),

and so

L^2 = (¼cot ½θ)^2+½ Σ_{n=1}^{∞} n u_n+Σ_{k=1}^{∞} u_k(1+u_k-½k)cos kθ
= (¼cot ½θ)^2+Σ_{k=1}^{∞} u_k(1+u_k)cos kθ+½ Σ_{k=1}^{∞} k u_k(1-cos kθ)
= T_1(x, θ)+T_2(x, θ).

† To justify this rearrangement we have to prove that

Σ_{n=1}^{∞} |u_n|(½+|cos θ|+...+½|cos nθ|)

and Σ_{m=1}^{∞} Σ_{n=1}^{∞} |u_m||u_n|(|cos(m+n)θ|+|cos(m-n)θ|)

are convergent. But this is an immediate consequence of the absolute convergence of

Σ_{n=1}^{∞} n u_n, Σ_{m=1}^{∞} Σ_{n=1}^{∞} u_m u_n.

THEOREM 384:
(¼+u_1-u_3+u_5-u_7+...)^2
= 1/16+½(u_1+2u_2+3u_3+5u_5+6u_6+7u_7+9u_9+...),

where in the last series there are no terms in u_4, u_8, u_{12},....

We put θ = ½π in Theorem 383. Then we have

T_1 = 1/16-Σ_{m=1}^{∞} (-1)^{m-1} u_{2m}(1+u_{2m}),

T_2 = ½ Σ_{m=1}^{∞} (2m-1)u_{2m-1}+2 Σ_{m=1}^{∞} (2m-1)u_{4m-2}.

Now, by Theorem 382,

T_1 = 1/16-Σ_{m=1}^{∞} (2m-1)u_{4m-2},

and so T_1+T_2 = 1/16+½(u_1+2u_2+3u_3+5u_5+...).
From Theorems 312 and 384 we deduce

THEOREM 385:

(1+2x+2x^4+2x^9+...)^4 = 1+8 Σ' m u_m,

where m runs through all positive integral values which are not multiples of 4.

Finally,

8 Σ' m u_m = 8 Σ' m x^m/(1-x^m) = 8 Σ' m Σ_{r=1}^{∞} x^{mr} = 8 Σ_{n=1}^{∞} c_n x^n,

where c_n = Σ_{m|n, 4∤m} m

is the sum of the divisors of n which are not multiples of 4.

It is plain that c_n > 0 for all n > 0, and so r_4(n) > 0. This provides
us with another proof of Theorem 369; and we have also proved

THEOREM 386. The number of representations of a positive integer n as
the sum of four squares, representations which differ only in order or sign
being counted as distinct, is 8 times the sum of the divisors of n which are
not multiples of 4.
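Theorem 386 lends itself to a direct numerical check. In the sketch below, `r4_bruteforce` counts ordered, signed representations by convolving the two-square counts, and `r4_formula` evaluates 8 times the sum of the divisors not divisible by 4; both names are illustrative and the method is plain enumeration, unrelated to the analytic proof.

```python
from math import isqrt

def r4_bruteforce(n):
    """Number of (m1, m2, m3, m4) in Z^4 with m1^2+m2^2+m3^2+m4^2 == n."""
    bound = isqrt(n)
    r2 = [0] * (n + 1)            # r2[t] = number of (a, b) with a^2 + b^2 == t
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if a*a + b*b <= n:
                r2[a*a + b*b] += 1
    return sum(r2[t] * r2[n - t] for t in range(n + 1))

def r4_formula(n):
    """8 times the sum of the divisors of n that are not multiples of 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)

print([r4_formula(n) for n in range(1, 9)])  # [8, 24, 32, 24, 48, 96, 64, 24]
```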


20.13. Representations by a larger number of squares. There
are similar formulae for the numbers of representations of n by 6 or 8
squares. Thus

r_6(n) = 16 Σ_{d|n} χ(d')d^2-4 Σ_{d|n} χ(d)d^2,

where dd' = n and χ(d), as in § 16.9, is 1, -1, or 0 according as d is
4k+1, 4k-1, or 2k; and

r_8(n) = 16(-1)^n Σ_{d|n} (-1)^d d^3.

These formulae are the arithmetical equivalents of the identities

(1+2x+2x^4+...)^6
= 1+16(1^2 x/(1+x^2)+2^2 x^2/(1+x^4)+3^2 x^3/(1+x^6)+...)
-4(1^2 x/(1-x)-3^2 x^3/(1-x^3)+5^2 x^5/(1-x^5)-...)

and

(1+2x+2x^4+...)^8 = 1+16(1^3 x/(1+x)+2^3 x^2/(1-x^2)+3^3 x^3/(1+x^3)+...).

These identities also can be proved in an elementary manner, but have
their roots in the theory of the elliptic modular functions. That r_6(n)
and r_8(n) are positive for all n is trivial after Theorem 369.
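The formulae for r_6(n) and r_8(n) can be compared with coefficients obtained by multiplying out the theta series. The sketch below is a numerical illustration only; `theta_power`, `chi`, `r6_formula` and `r8_formula` are hypothetical names, and the truncation point 40 is arbitrary.

```python
from math import isqrt

def theta_power(s, limit):
    """Coefficients of (sum over all integers m of x^(m^2))^s up to x^limit."""
    theta = [0] * (limit + 1)
    for m in range(-isqrt(limit), isqrt(limit) + 1):
        theta[m * m] += 1
    coeffs = [1] + [0] * limit
    for _ in range(s):
        coeffs = [sum(coeffs[i] * theta[n - i] for i in range(n + 1))
                  for n in range(limit + 1)]
    return coeffs

def chi(d):
    """1, -1 or 0 according as d is 4k+1, 4k-1 or even."""
    return 0 if d % 2 == 0 else (1 if d % 4 == 1 else -1)

def r6_formula(n):
    divs = [d for d in range(1, n + 1) if n % d == 0]
    return 16 * sum(chi(n // d) * d * d for d in divs) - 4 * sum(chi(d) * d * d for d in divs)

def r8_formula(n):
    return 16 * (-1) ** n * sum((-1) ** d * d ** 3 for d in range(1, n + 1) if n % d == 0)

r6, r8 = theta_power(6, 40), theta_power(8, 40)
print(all(r6[n] == r6_formula(n) and r8[n] == r8_formula(n) for n in range(1, 41)))
```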

The formulae for r_s(n), where s = 10, 12,..., involve other arithmetical
functions of a more recondite type. Thus r_{10}(n) involves sums of powers
of the complex divisors of n.

The corresponding problems for representations of n by sums of an
odd number of squares are more difficult, as may be inferred from
§ 20.10. When s is 3, 5, or 7 the number of representations is expressible
as a finite sum involving the symbol (m/n) of Legendre and Jacobi.


NOTES ON CHAPTER XX 


§ 20.1. Waring made his assertion in Meditationes algebraicae (1770), 204-5, 
and Lagrange proved that g(2) = 4 later in the same year. There is an exhaustive 
account of the history of the four-square theorem in Dickson, History, ii, ch. viii.

Hilbert’s proof of the existence of g(k) for every k was published in Göttinger
Nachrichten (1909), 17-36, and Math. Annalen, 67 (1909), 281-305. Previous 
writers had proved its existence when k = 3, 4, 5, 6, 7, 8, and 10, but its value 
had been determined only for k = 3. The value of g(k) is now known for all k 
except 4 and 5: that of G(k) for k = 2 and k = 4 only. The determinations of 
g(k) rest on a previous determination of an upper bound for G(k). 

See also Dickson, History, ii, ch. 25, and our notes on Ch. XXI. 

Lord Saltoun drew my attention to an error on p. 298.

§ 20.3. This proof is due to Hermite, Journal de math. (1), 13 (1848), 15 (Euvres, 
i. 264). 

§ 20.4. The fourth proof is due to Grace, Journal London Math. Soc. 2 (1927),
3-8. Grace also gives a proof of Theorem 369 based on simple properties of
four-dimensional lattices.

§ 20.5. Bachet enunciated Theorem 369 in 1621, though he did not profess to 
have proved it. The proof in this section is substantially Euler’s.

§§ 20.6-9. These sections are based on Hurwitz, Vorlesungen über die
Zahlentheorie der Quaternionen (Berlin, 1919). Hurwitz develops the theory in much
greater detail, and uses it to find the formulae of § 20.12. We go so far only as 
is necessary for the proof of Theorem 370; we do not, for example, prove any 
general theorem concerning uniqueness of factorization. There is another account
of Hurwitz’s theory, with generalizations, in Dickson, Algebren und ihre
Zahlentheorie (Zürich, 1927), ch. 9.

The first arithmetic of quaternions was constructed by Lipschitz,
Untersuchungen über die Summen von Quadraten, Bonn, 1886. Lipschitz defines an
integral quaternion in the most obvious manner, viz. as one with integral
coordinates, but his theory is much more complicated than Hurwitz’s. Later, Dickson
[Proc. London Math. Soc. (2) 20 (1922), 225-32] worked out an alternative and
much simpler theory based on Lipschitz’s definition. We followed this theory
in our first edition, but it is less satisfactory than Hurwitz’s: it is not true, for 
example, in Dickson’s theory, that any two integral quaternions have a highest 
common right-hand divisor. 

§ 20.10. The ‘three-square theorem’, which we do not prove, is due to Legendre,
Essai sur la théorie des nombres (1798), 202, 398-9, and Gauss, D.A., § 291. Gauss
determined the number of representations. See Landau, Vorlesungen, i. 114-25. 
There is another proof, depending on the methods of Liouville, referred to in the 
note on § 20.13 below, in Uspensky and Heaslet, 465—74. 

§§ 20.11-12. Ramanujan, Collected papers, 138 et seq. 

§ 20.13. The results for 6 and 8 squares are due to Jacobi, and are contained 
implicitly in the formulae of §§ 40-42 of the Fundamenta nova. They are stated
explicitly in Smith’s Report on the theory of numbers (Collected papers, i. 306-7).
Liouville gave formulae for 12 and 10 squares in the Journal de math. (2) 9 (1864),
296-8, and 11 (1866), 1-8. Glaisher, Proc. London Math. Soc. (2) 5 (1907), 479-90,
gave a systematic table of formulae for r_{2s}(n) up to 2s = 18, based on previous
work published in vols. 36—39 of the Quarterly Journal of Math. The formulae 
for 14 and 18 squares contain functions defined only as the coefficients in certain 
modular functions and not arithmetically. Ramanujan (Collected papers, no. 18) 
continues Glaisher’s table up to 2s = 24. 

Boulyguine, in 1914, found general formulae for r_{2s}(n) in which every function
which occurs has an arithmetical definition. Thus the formula for r_{2s}(n) contains
functions Σ φ(x_1, x_2,..., x_t) where φ is a polynomial, t has one of the values 2s-8,
2s-16,..., and the summation is over all solutions of x_1^2+x_2^2+...+x_t^2 = n. There
are references to Boulyguine’s work in Dickson’s History, ii. 317.

Uspensky developed the elementary methods which seem to have been used 
by Liouville in a series of papers published in Russian: references will be found
in a later paper in Trans. Amer. Math. Soc. 30 (1928), 385-404. He carries his 
analysis up to 2s = 12, and states that his methods enable him to prove Bouly- 
guine’s general formulae. 

A more analytic method, applicable also to representations by an odd number 
of squares, has been developed by Hardy, Mordell, and Ramanujan. See Hardy, 
Trans. Amer. Math. Soc. 21 (1920), 255-84, and Ramanujan, ch. 9; Mordell, 
Quarterly Journal of Math. 48 (1920), 93-104, and Trans. Camb. Phil. Soc. 22
(1923), 361-72; Estermann, Acta arithmetica, 2 (1936), 47-79; and nos. 18 and 
21 of Ramanujan’s Collected papers. 

We defined Legendre’s symbol in § 6.5. Jacobi’s generalization is defined in
the more systematic treatises, e.g. in Landau, Vorlesungen, i. 47. 


XXI 
REPRESENTATION BY CUBES AND HIGHER POWERS 


21.1. Biquadrates. We defined ‘ Waring’s problem’ in § 20.1 as the 
problem of determining g(k) and G(k), and solved it completely when 
k = 2. The general problem is much more difficult. Even the proof
of the existence of g(k) and G(k) requires quite elaborate analysis; and 
the value of G(k) is not known for any k but 2 and 4. We give a sum- 
mary of the present state of knowledge at the end of the chapter, but 
we shall prove only a few special theorems, and these usually not the 
best of their kind that are known. 


It is easy to prove the existence of g(4).

THEOREM 387. g(4) exists, and does not exceed 50.

The proof depends on Theorem 369 and the identity

(21.1.1) 6(a^2+b^2+c^2+d^2)^2 = (a+b)^4+(a-b)^4+(c+d)^4+(c-d)^4
+(a+c)^4+(a-c)^4+(b+d)^4+(b-d)^4
+(a+d)^4+(a-d)^4+(b+c)^4+(b-c)^4.

We denote by B_s a number which is the sum of s or fewer biquadrates.
Thus (21.1.1) shows that

6(a^2+b^2+c^2+d^2)^2 = B_{12},

and therefore, after Theorem 369, that

(21.1.2) 6x^2 = B_{12}

for every x.

Now any positive integer n is of the form

n = 6N+r,

where N ≥ 0 and r is 0, 1, 2, 3, 4, or 5. Hence (again by Theorem 369)

n = 6(x_1^2+x_2^2+x_3^2+x_4^2)+r;

and therefore, by (21.1.2),

n = B_{12}+B_{12}+B_{12}+B_{12}+r = B_{48}+r = B_{53}

(since r is expressible by at most 5 1’s). Hence g(4) exists and is at
most 53.


It is easy to improve this result a little. Any n ≥ 81 is expressible as

n = 6N+t,

where N ≥ 0, and t = 0, 1, 2, 81, 16, or 17, according as n ≡ 0, 1, 2, 3, 4,
or 5 (mod 6). But

1 = 1^4, 2 = 1^4+1^4, 81 = 3^4, 16 = 2^4, 17 = 2^4+1^4.

Hence t = B_2 and therefore

n = B_{48}+B_2 = B_{50},

so that any n ≥ 81 is B_{50}.

On the other hand it is easily verified that n = B_{19} if 1 ≤ n ≤ 80.
In fact only

79 = 4.2^4+15.1^4

requires 19 biquadrates.
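Both the identity (21.1.1) and the statement about numbers up to 80 can be checked mechanically. The sketch below is an illustration with hypothetical function names: the first function tests the identity at given integers, and the second finds the least number of biquadrates for each n by a simple dynamic programme.

```python
from itertools import combinations

def identity_holds(a, b, c, d):
    """Check (21.1.1): the right side runs over the 6 pairs, giving 12 biquadrates."""
    rhs = sum((x + y) ** 4 + (x - y) ** 4 for x, y in combinations((a, b, c, d), 2))
    return 6 * (a*a + b*b + c*c + d*d) ** 2 == rhs

def least_biquadrates(limit):
    """best[n] = least number of positive fourth powers with sum n, 0 <= n <= limit."""
    powers = [x ** 4 for x in range(1, limit + 1) if x ** 4 <= limit]
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        best[n] = 1 + min(best[n - p] for p in powers if p <= n)
    return best

print(identity_holds(3, -1, 4, 2))                    # True
best = least_biquadrates(80)
print(max(best), [n for n in range(1, 81) if best[n] == 19])  # 19 [79]
```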


21.2. Cubes: the existence of G(3) and g(3). The proof of the
existence of g(3) is more sophisticated (as is natural because a cube
may be negative). We prove first

THEOREM 388: G(3) ≤ 13.

We denote by C_s a number which is the sum of s non-negative cubes.

We suppose that z runs through the values 7, 13, 19,... congruent to
1 (mod 6), and that I_z is the interval

φ(z) = 11z^9+(z^3+1)^3+125z^3 ≤ n ≤ 14z^9 = ψ(z).

It is plain that φ(z+6) ≤ ψ(z) for large z, so that the intervals I_z
ultimately overlap, and every large n lies in some I_z. It is therefore
sufficient to prove that every n of I_z is the sum of 13 non-negative cubes.
We prove that any n of I_z can be expressed in the form

(21.2.1) n = N+8z^9+6mz^3,

where

(21.2.2) N = C_5, 0 < m < z^6.

We shall then have m = x_1^2+x_2^2+x_3^2+x_4^2,

where 0 ≤ x_i < z^3; and so

n = N+8z^9+6z^3(x_1^2+x_2^2+x_3^2+x_4^2)
= N+Σ_{i=1}^{4} {(z^3+x_i)^3+(z^3-x_i)^3}
= C_5+C_8 = C_{13}.
It remains to prove (21.2.1). We define r, s, and N by

n ≡ 6r (mod z^3) (1 ≤ r ≤ z^3),
n ≡ s+4 (mod 6) (0 ≤ s ≤ 5),
N = (r+1)^3+(r-1)^3+2(z^3-r)^3+(sz)^3.

Then N = C_5; and

0 < N < (z^3+1)^3+3z^9+125z^3 = φ(z)-8z^9 ≤ n-8z^9,

so that

(21.2.3) 8z^9 < n-N < 14z^9.

Now N ≡ (r+1)^3+(r-1)^3-2r^3 = 6r ≡ n ≡ n-8z^9 (mod z^3).
Also x^3 ≡ x (mod 6) for every x, and so

N ≡ r+1+r-1+2(z^3-r)+sz = 2z^3+sz
≡ (2+s)z ≡ 2+s ≡ n-2
≡ n-8 ≡ n-8z^9 (mod 6).

Hence n-N-8z^9 is a multiple of 6z^3. This proves (21.2.1), and the
inequality in (21.2.2) follows from (21.2.3).
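The proof of Theorem 388 is constructive, and the construction can be followed step by step for a particular n. The sketch below takes z = 7 and an arbitrarily chosen n of I_7; the names `four_squares` and `thirteen_cubes` are hypothetical, the four-square decomposition is found by brute force (the text only asserts that it exists, by Theorem 369), and the modular inverse `pow(6, -1, z3)` assumes Python 3.8 or later.

```python
from math import isqrt

def four_squares(m):
    """Brute-force x1 <= x2 <= x3 and x4 with x1^2+x2^2+x3^2+x4^2 == m."""
    for a in range(isqrt(m) + 1):
        for b in range(a, isqrt(m - a*a) + 1):
            for c in range(b, isqrt(m - a*a - b*b) + 1):
                rest = m - a*a - b*b - c*c
                d = isqrt(rest)
                if d*d == rest:
                    return (a, b, c, d)

def thirteen_cubes(n, z):
    """Write n in I_z as a sum of 13 non-negative cubes, following (21.2.1)."""
    z3 = z ** 3
    assert z % 6 == 1 and 11*z**9 + (z3 + 1)**3 + 125*z3 <= n <= 14*z**9
    r = (n * pow(6, -1, z3)) % z3 or z3      # n ≡ 6r (mod z^3), 1 <= r <= z^3
    s = (n - 4) % 6                          # n ≡ s+4 (mod 6), 0 <= s <= 5
    five = [r + 1, r - 1, z3 - r, z3 - r, s * z]
    N = sum(x ** 3 for x in five)            # N = C_5
    m, rem = divmod(n - N - 8 * z**9, 6 * z3)
    assert rem == 0 and 0 < m < z**6         # (21.2.1) and (21.2.2)
    xs = four_squares(m)
    return five + [z3 + x for x in xs] + [z3 - x for x in xs]

n = 13 * 7**9 + 12345
cubes = thirteen_cubes(n, 7)
print(cubes, sum(c ** 3 for c in cubes) == n)
```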
The existence of g(3) is a corollary of Theorem 388. It is however 


interesting to show that the bound for G(3) stated in the theorem is 
also a bound for g(3). 


21.3. A bound for g(3). We must begin by proving a sharpened
form of Theorem 388, with a definite limit beyond which all numbers
are C_{13}.

THEOREM 389. If n ≥ 10^{25}, then n = C_{13}.

We prove first that φ(z+6) ≤ ψ(z) if z ≥ 373, or that

11t^9+(t^3+1)^3+125t^3 ≤ 14(t-6)^9,

i.e.

(21.3.1) 14(1-6/t)^9 ≥ 12+3/t^3+128/t^6+1/t^9,

if t ≥ 379. Now (1-δ)^m ≥ 1-mδ

if 0 < δ < 1. Hence 14(1-6/t)^9 ≥ 14(1-54/t)

if t > 6; and so (21.3.1) is satisfied if

14(1-54/t) ≥ 12+3/t^3+128/t^6+1/t^9,

or if 2(t-7.54) ≥ 3/t^2+128/t^5+1/t^8.

This is clearly true if t ≥ 7.54+1 = 379.

It follows that the intervals I_z overlap from z = 373 onwards, and n
certainly lies in an I_z if n ≥ 14(373)^9,
which is less than 10^{25}.

We have now to consider representations of numbers less than 10^{25}.
It is known from tables that all numbers up to 40000 are C_9, and that,
among these numbers, only 23 and 239 require as many cubes as 9.
Hence

n = C_9 (1 ≤ n ≤ 239), n = C_8 (240 ≤ n ≤ 40000).

Next, if N ≥ 1 and m = [N^{1/3}], we have

N-m^3 = (N^{1/3})^3-m^3 ≤ 3N^{2/3}(N^{1/3}-m) < 3N^{2/3}.

Now let us suppose that

240 ≤ n ≤ 10^{25}

and put n = 240+N, 0 ≤ N < 10^{25}.
Then

N = m^3+N_1, m = [N^{1/3}], 0 ≤ N_1 < 3N^{2/3},
N_1 = m_1^3+N_2, m_1 = [N_1^{1/3}], 0 ≤ N_2 < 3N_1^{2/3},
...
N_4 = m_4^3+N_5, m_4 = [N_4^{1/3}], 0 ≤ N_5 < 3N_4^{2/3}.

Hence

(21.3.2) n = 240+N = 240+N_5+m^3+m_1^3+m_2^3+m_3^3+m_4^3.

Here

0 ≤ N_5 < 3N_4^{2/3} ≤ 3(3N_3^{2/3})^{2/3} ≤ ...
≤ 3.3^{2/3}.3^{(2/3)^2}.3^{(2/3)^3}.3^{(2/3)^4} N^{(2/3)^5}
= 27(N/27)^{(2/3)^5} < 27(10^{25}/27)^{(2/3)^5} < 35000.

Hence 240 ≤ 240+N_5 < 35240 < 40000,
and so 240+N_5 is C_8; and therefore, by (21.3.2), n is C_{13}. Hence all
positive integers are sums of 13 cubes.

THEOREM 390: g(3) ≤ 13.

The true value of g(3) is 9, but the proof of this demands Legendre’s
theorem (§ 20.10) on the representation of numbers by sums of three
squares. We have not proved this theorem and are compelled to use 


Theorem 369 instead, and it is this which accounts for the imperfection 
of our result. 


21.4. Higher powers. In § 21.1 we used the identity (21.1.1) to
deduce the existence of g(4) from that of g(2). There are similar
identities which enable us to deduce the existence of g(6) and g(8) from that
of g(3) and g(4). Thus

(21.4.1) 60(a^2+b^2+c^2+d^2)^3 = Σ(a±b±c)^6+2Σ(a±b)^6+36Σa^6.

On the right there are

16+2.12+36.4 = 184

sixth powers. Now any n is of the form

60N+r (0 ≤ r ≤ 59);

and 60N = 60 Σ_{i=1}^{g(3)} X_i^3 = 60 Σ_{i=1}^{g(3)} (a_i^2+b_i^2+c_i^2+d_i^2)^3,

which, by (21.4.1), is the sum of 184g(3) sixth powers. Hence n is the
sum of

184g(3)+r ≤ 184g(3)+59

sixth powers; and so, by Theorem 390,

THEOREM 391: g(6) ≤ 184g(3)+59 ≤ 2451.

Again, the identity

(21.4.2) 5040(a^2+b^2+c^2+d^2)^4
= 6Σ(2a)^8+60Σ(a±b)^8+Σ(2a±b±c)^8+6Σ(a±b±c±d)^8

has 6.4+60.12+48+6.8 = 840
eighth powers on its right-hand side. Hence, as above, any number
5040N is the sum of 840g(4) eighth powers. Now any number up to
5039 is the sum of at most 273 eighth powers of 1 or 2.† Hence, by
Theorem 387,

THEOREM 392: g(8) ≤ 840g(4)+273 ≤ 42273.

The results of Theorems 391 and 392 are, numerically, very poor; and
the theorems are really interesting only as existence theorems. It is
known that g(6) = 73 and that g(8) = 279.
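The identities (21.4.1) and (21.4.2) are stated without proof, so a numerical check may be reassuring. In the sketch below the summations are read as follows, which is an interpretation consistent with the term counts 184 and 840 given in the text: Σ(a±b±c)^6 runs over the 4 triples and 4 sign choices, Σ(a±b) over the 6 pairs and 2 signs, and Σ(2a±b±c)^8 over the 4 choices of doubled letter, the 3 pairs of remaining letters and 4 signs. The function names are hypothetical.

```python
from itertools import combinations, product

def rhs_sixth(a, b, c, d):
    """Right side of (21.4.1)."""
    v = (a, b, c, d)
    triples = sum((x + s*y + t*z) ** 6 for x, y, z in combinations(v, 3)
                  for s, t in product((1, -1), repeat=2))            # 16 terms
    pairs = sum((x + s*y) ** 6 for x, y in combinations(v, 2) for s in (1, -1))
    return triples + 2 * pairs + 36 * sum(x ** 6 for x in v)

def rhs_eighth(a, b, c, d):
    """Right side of (21.4.2)."""
    v = (a, b, c, d)
    doubled = sum((2*v[i] + s*y + t*z) ** 8 for i in range(4)
                  for y, z in combinations(v[:i] + v[i+1:], 2)
                  for s, t in product((1, -1), repeat=2))            # 48 terms
    full = sum((a + s*b + t*c + u*d) ** 8
               for s, t, u in product((1, -1), repeat=3))            # 8 terms
    pairs = sum((x + s*y) ** 8 for x, y in combinations(v, 2) for s in (1, -1))
    return 6 * sum((2*x) ** 8 for x in v) + 60 * pairs + doubled + 6 * full

a, b, c, d = 3, -2, 5, 1
q = a*a + b*b + c*c + d*d
print(60 * q ** 3 == rhs_sixth(a, b, c, d), 5040 * q ** 4 == rhs_eighth(a, b, c, d))

# With only 1^8 and 2^8 available below 3^8 = 6561, n needs n // 256 + n % 256 powers.
worst = max(range(1, 5040), key=lambda n: n // 256 + n % 256)
print(worst, worst // 256 + worst % 256)  # 4863 273
```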


21.5. A lower bound for g(k). We have found upper bounds for 
g(k), and a fortiori for G(k), for k = 3, 4, 6, and 8, but they are a good 
deal larger than those given by deeper methods. There is also the 
problem of finding lower bounds, and here elementary methods are 
relatively much more effective. It is indeed quite easy to prove all
that is known at present.

We begin with g(k). Let us write q = [(3/2)^k]. The number

n = 2^k q-1 < 3^k

can only be represented by the powers 1^k and 2^k. In fact

n = (q-1)2^k+(2^k-1)1^k,

and so n requires just

q-1+2^k-1 = 2^k+q-2

kth powers. Hence

THEOREM 393: g(k) ≥ 2^k+q-2.

In particular g(2) ≥ 4, g(3) ≥ 9, g(4) ≥ 19, g(5) ≥ 37,.... It is known
that g(k) = 2^k+q-2 for all values of k up to 400 except perhaps 4 and
5, and it is quite likely that this is true for every k.
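The bound of Theorem 393 is simple to tabulate. The sketch below computes q = [(3/2)^k] exactly with integer division; `waring_lower_bound` is a hypothetical name.

```python
def waring_lower_bound(k):
    """2^k + q - 2 with q = [(3/2)^k], the lower bound for g(k) in Theorem 393."""
    q = 3 ** k // 2 ** k
    return 2 ** k + q - 2

print([waring_lower_bound(k) for k in range(2, 9)])  # [4, 9, 19, 37, 73, 143, 279]
```

The entries for k = 6 and k = 8 agree with the values g(6) = 73 and g(8) = 279 quoted in § 21.4.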


† The worst number is 4863 = 18.2^8+255.1^8.

21.6. Lower bounds for G(k). Passing to G(k), we prove first a
general theorem for every k.

THEOREM 394: G(k) ≥ k+1 for k ≥ 2.

Let A(N) be the number of numbers n ≤ N which are representable
in the form

(21.6.1) n = x_1^k+x_2^k+...+x_k^k,

where x_i ≥ 0. We may suppose the x_i arranged in ascending order of
magnitude, so that

(21.6.2) 0 ≤ x_1 ≤ x_2 ≤ ... ≤ x_k ≤ N^{1/k}.

Hence A(N) does not exceed the number of solutions of the inequalities
(21.6.2), which is

B(N) = Σ_{x_k=0}^{[N^{1/k}]} Σ_{x_{k-1}=0}^{x_k} Σ_{x_{k-2}=0}^{x_{k-1}} ... Σ_{x_1=0}^{x_2} 1.

The summation with respect to x_1 gives x_2+1, that with respect to x_2
gives

Σ_{x_2=0}^{x_3} (x_2+1) = (x_3+1)(x_3+2)/2!,

that with respect to x_3 gives

Σ_{x_3=0}^{x_4} (x_3+1)(x_3+2)/2! = (x_4+1)(x_4+2)(x_4+3)/3!,

and so on: so that

(21.6.3) B(N) = (1/k!) Π_{r=1}^{k} ([N^{1/k}]+r) ~ N/k!

for large N.

On the other hand, if G(k) ≤ k, all but a finite number of n are
representable in the form (21.6.1), and

A(N) > N-C,

where C is independent of N. Hence

N-C < A(N) ≤ B(N) ~ N/k!,

which is plainly impossible when k > 1. It follows that G(k) > k.

Theorem 394 gives the best known universal lower bound for G(k). 
There are arguments based on congruences which give equivalent, or 
better, results for special forms of k. Thus 


x^3 ≡ 0, 1, or -1 (mod 9),

and so at least 4 cubes are required to represent a number N = 9m±4.
This proves that G(3) ≥ 4, a special case of Theorem 394.

Again

(21.6.4) x^4 ≡ 0 or 1 (mod 16),

and so all numbers 16m+15 require at least 15 biquadrates. It follows
that G(4) ≥ 15. This is a much better result than that given by
Theorem 394, and we can improve it slightly.

It follows from (21.6.4) that, if 16n is the sum of 15 or fewer
biquadrates, each of these biquadrates must be a multiple of 16. Hence

16n = Σ_{i=1}^{15} x_i^4 = Σ_{i=1}^{15} (2y_i)^4

and so n = Σ_{i=1}^{15} y_i^4.

Hence, if 16n is the sum of 15 or fewer biquadrates, so is n. But 31 is
not the sum of 15 or fewer biquadrates; and so 16^m.31 is not, for any m.
Hence

THEOREM 395: G(4) ≥ 16.
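The congruences used above, and the behaviour of the numbers 16^m.31, can be confirmed for small cases. The sketch below is an illustration only; `least_biquadrates` is a hypothetical helper that finds the least number of biquadrates by a simple dynamic programme.

```python
cube_residues = sorted({x ** 3 % 9 for x in range(9)})
fourth_residues = sorted({x ** 4 % 16 for x in range(16)})
print(cube_residues, fourth_residues)  # [0, 1, 8] [0, 1]

def least_biquadrates(n):
    """Least number of positive fourth powers with sum n."""
    powers = [x ** 4 for x in range(1, n + 1) if x ** 4 <= n]
    best = [0] * (n + 1)
    for t in range(1, n + 1):
        best[t] = 1 + min(best[t - p] for p in powers if p <= t)
    return best[n]

print(least_biquadrates(31), least_biquadrates(16 * 31))  # 16 16
```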

More generally

THEOREM 396: G(2^θ) ≥ 2^{θ+2} if θ ≥ 2.

The case θ = 2 has been dealt with already. If θ > 2, then

k = 2^θ > θ+2.

Hence, if x is even, x^{2^θ} ≡ 0 (mod 2^{θ+2}),

while if x is odd then

x^{2^θ} = (1+2m)^{2^θ} ≡ 1+2^{θ+1}m+2^{θ+1}(2^θ-1)m^2
≡ 1-2^{θ+1}m(m-1) ≡ 1 (mod 2^{θ+2}).

Thus

(21.6.5) x^{2^θ} ≡ 0 or 1 (mod 2^{θ+2}).

Now let n be any odd number and suppose that 2^{θ+2}n is the sum of
2^{θ+2}-1 or fewer kth powers. Then each of these powers must be even,
by (21.6.5), and so divisible by 2^k. Hence 2^{k-θ-2} | n, and so n is even;
a contradiction which proves Theorem 396.

It will be observed that the last stage in the proof fails for θ = 2,
when a special device is needed.

There are three more theorems which, when they are applicable, give
better results than Theorem 394.

THEOREM 397. If p > 2 and θ ≥ 0, then G{p^θ(p-1)} ≥ p^{θ+1}.

For example, G(6) ≥ 9.

If k = p^θ(p-1), then θ+1 ≤ 3^θ < k. Hence

x^k ≡ 0 (mod p^{θ+1})

if p | x. On the other hand, if p ∤ x, we have

x^k = x^{p^θ(p-1)} ≡ 1 (mod p^{θ+1})

by Theorem 72. Hence, if p^{θ+1}n, where p ∤ n, is the sum of p^{θ+1}-1 or
fewer kth powers, each of these powers must be divisible by p^{θ+1} and so
by p^k. Hence p^k | p^{θ+1}n, which is impossible; and therefore G(k) ≥ p^{θ+1}.

THEOREM 398. If p > 2 and θ ≥ 0, then G{½p^θ(p-1)} ≥ ½(p^{θ+1}-1).

For example, G(10) ≥ 12.

It is plain that

k = ½p^θ(p-1) ≥ p^θ > θ+1,

except in the trivial case p = 3, θ = 0, k = 1. Hence

x^k ≡ 0 (mod p^{θ+1})

if p | x. On the other hand, if p ∤ x, then

x^{2k} = x^{p^θ(p-1)} ≡ 1 (mod p^{θ+1})

by Theorem 72. Hence p^{θ+1} | (x^{2k}-1), i.e.

p^{θ+1} | (x^k-1)(x^k+1).

Since p > 2, p cannot divide both x^k-1 and x^k+1, and so one of x^k-1
and x^k+1 is divisible by p^{θ+1}. It follows that

x^k ≡ 0, 1, or -1 (mod p^{θ+1})

for every x; and therefore that numbers of the form

p^{θ+1}m ± ½(p^{θ+1}-1)

require at least ½(p^{θ+1}-1) kth powers.

Tueorem 399. If 0 > 2,+ then G(3.29) > 29+2, 

This is a trivial corollary of Theorem 396, since G(3. 2%) > G(2%) > 20+2, 

We may sum up the results of this section in the following theorem. 

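THEOREM 400. G(k) has the lower bounds

(i) $2^{\theta+2}$ if k is $2^\theta$ or $3 \cdot 2^\theta$ and $\theta \ge 2$;

(ii) $p^{\theta+1}$ if $p > 2$ and $k = p^\theta(p-1)$;

(iii) $\tfrac12(p^{\theta+1}-1)$ if $p > 2$ and $k = \tfrac12 p^\theta(p-1)$;

(iv) $k+1$ in any case.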

These are the best known lower bounds for G(k). It is easily verified 
that none of them exceeds 4k, so that the lower bounds for G(k) are
much smaller, for large k, than the lower bound for g(k) assigned by
Theorem 393. The value of g(k) is, as we remarked in § 20.1, inflated by
the difficulty of representing certain comparatively small numbers. 


† The theorem is true for $\theta = 0$ and $\theta = 1$, but is then included in Theorems 394
and 397.

It is to be observed that k may be of several of the special forms
mentioned in Theorem 400. Thus

$6 = 3(3-1) = 7-1 = \tfrac12(13-1),$

so that 6 is expressible in two ways in the form (ii) and in one in the
form (iii). The lower bounds assigned by the theorem are

$3^2 = 9, \quad 7^1 = 7, \quad \tfrac12(13-1) = 6, \quad 6+1 = 7;$

and the first gives the strongest result.
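
The case analysis of Theorem 400 can be mechanized. The following sketch is an illustration only; the function names `is_prime` and `theorem_400_bounds` are ours. It collects every lower bound that the theorem assigns to a given k and labels each by the case that produces it.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def theorem_400_bounds(k):
    """List of (case, bound) pairs that Theorem 400 assigns to G(k)."""
    bounds = [("iv", k + 1)]
    # (i): k = 2^theta or 3 * 2^theta with theta >= 2
    theta = 2
    while 2 ** theta <= k:
        if k in (2 ** theta, 3 * 2 ** theta):
            bounds.append(("i", 2 ** (theta + 2)))
        theta += 1
    # (ii) and (iii): odd primes p with p^theta (p - 1) equal to k or to 2k
    for p in range(3, 2 * k + 2):
        if not is_prime(p):
            continue
        theta = 0
        while p ** theta * (p - 1) <= 2 * k:
            value = p ** theta * (p - 1)
            if value == k:
                bounds.append(("ii", p ** (theta + 1)))
            if value == 2 * k:
                bounds.append(("iii", (p ** (theta + 1) - 1) // 2))
            theta += 1
    return bounds

print(theorem_400_bounds(6))
```

For k = 6 the list contains the four bounds 9, 7, 6, 7 discussed above, and the largest is 9.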
21.7. Sums affected with signs: the number v(k). It is also
natural to consider the representation of an integer n as the sum of
s members of the set

(21.7.1) $0,\ 1^k,\ 2^k,\ \ldots,\ -1^k,\ -2^k,\ -3^k,\ \ldots,$

or in the form

(21.7.2) $n = \pm x_1^k \pm x_2^k \pm \ldots \pm x_s^k.$

We use v(k) to denote the least value of s for which every n is
representable in this manner.

The problem is in most ways more tractable than Waring’s problem, 
but the solution is in one way still more incomplete. The value of g(k) 
is known for many k, while that of v(k) has not been found for any k 
but 2. The main difficulty here lies in the determination of a lower 
bound for v(k); there is no theorem corresponding effectively to Theorem 
393 or even to Theorem 394. 


THEOREM 401: v(k) exists for every k.


It is obvious that, if g(k) exists, then v(k) exists and does not exceed
g(k). But the direct proof of the existence of v(k) is very much easier
than that of the existence of g(k).

We require a lemma. 


THEOREM 402:

$\sum_{r=0}^{k-1} (-1)^{k-1-r} \binom{k-1}{r} (x+r)^k = k!\,x + d,$

where d is an integer independent of x.


The reader familiar with the elements of the calculus of finite
differences will at once recognize this as a well-known property of the
(k-1)th difference of $x^k$. It is plain that, if

$Q_k(x) = A_k x^k + \ldots$

is a polynomial of degree k, then

$\Delta Q_k(x) = Q_k(x+1) - Q_k(x) = kA_k x^{k-1} + \ldots,$

$\Delta^2 Q_k(x) = k(k-1)A_k x^{k-2} + \ldots,$

$\ldots$

$\Delta^{k-1} Q_k(x) = k!\,A_k x + d,$

where d is independent of x. The lemma is the case $Q_k(x) = x^k$. In
fact $d = \tfrac12(k-1)(k!)$, but we make no use of this.
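
The lemma, together with the value $d = \tfrac12(k-1)(k!)$, can be checked directly for small k. The sketch below is a numerical illustration rather than a proof, and the function name `alternating_sum` is ours.

```python
from math import comb, factorial

def alternating_sum(k, x):
    """Left side of Theorem 402: the (k-1)th difference of t**k at t = x."""
    return sum((-1) ** (k - 1 - r) * comb(k - 1, r) * (x + r) ** k
               for r in range(k))

for k in range(1, 8):
    d = (k - 1) * factorial(k) // 2
    # the sum should equal k! x + d for every integer x
    assert all(alternating_sum(k, x) == factorial(k) * x + d
               for x in range(-20, 21))
```

For k = 4 this gives d = 36, the value used in (21.8.2).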
It follows at once from the lemma that any number of the form
$k!\,x + d$ is expressible as the sum of

$\sum_{r=0}^{k-1} \binom{k-1}{r} = 2^{k-1}$

numbers of the set (21.7.1); and

$n - d = k!\,x + l, \quad -\tfrac12(k!) < l \le \tfrac12(k!)$

for any n and appropriate l and x. Thus

$n = (k!\,x + d) + l,$

and n is the sum of

$2^{k-1} + |l| \le 2^{k-1} + \tfrac12(k!)$

numbers of the set (21.7.1), the term l being made up of |l| numbers $\pm 1^k$.

We have thus proved more than Theorem 401, viz.

THEOREM 403: $v(k) \le 2^{k-1} + \tfrac12(k!)$.
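
The proof of Theorem 403 is constructive, and the construction can be sketched in a few lines. In the code below the function name `signed_power_representation` and the output format (a list of pairs of sign and base) are our own choices, made for illustration only.

```python
from math import comb, factorial

def signed_power_representation(n, k):
    """Return (sign, base) pairs with n == sum(sign * base**k).

    Follows the proof of Theorem 403: n = (k! x + d) + l, |l| <= k!/2.
    """
    kfact = factorial(k)
    d = (k - 1) * kfact // 2
    # choose x and l with n - d = k! x + l and -k!/2 < l <= k!/2
    x, l = divmod(n - d, kfact)
    if 2 * l > kfact:
        x, l = x + 1, l - kfact
    terms = []
    for r in range(k):
        sign = (-1) ** (k - 1 - r)
        # the binomial coefficient counts repeated copies of (x + r)**k
        terms += [(sign, x + r)] * comb(k - 1, r)
    # the remainder l consists of |l| copies of +1^k or -1^k
    terms += [(1 if l > 0 else -1, 1)] * abs(l)
    return terms

terms = signed_power_representation(100, 3)
print(len(terms), sum(s * b ** 3 for s, b in terms))
```

A base x + r may be negative; since $\pm(-y)^k = \pm y^k$ or $\mp y^k$, each term is still a member of the set (21.7.1).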


21.8. Upper bounds for v(k). The upper bound in Theorem 403 is 
generally much too large. 

It is plain, as we observed in § 21.7, that $v(k) \le g(k)$. We can also
find an upper bound for v(k) if we have one for G(k). For any number
from a certain N(k) onwards is the sum of G(k) positive kth powers,
and

$n + y^k > N(k)$

for some y, so that

$n = \sum_{i=1}^{G(k)} x_i^k - y^k$

and

(21.8.1) $v(k) \le G(k)+1.$


This is usually a much better bound than g(k). 

The bound of Theorem 403 can also be improved substantially by 
more elementary methods. Here we consider only special values of k 
for which such elementary arguments give bounds better than (21.8.1).

(1) Squares. Theorem 403 gives $v(2) \le 3$, which also follows from
the identities

$2x+1 = (x+1)^2 - x^2$

and

$2x = x^2 - (x-1)^2 + 1^2.$

On the other hand, 6 cannot be expressed by two squares, since it
is not the sum of two, and $x^2 - y^2 = (x-y)(x+y)$ is either odd or a
multiple of 4.

THEOREM 404: v(2) = 3.

(2) Cubes. Since

$n^3 - n = (n-1)n(n+1) \equiv 0 \pmod 6$

for any n, we have

$n = n^3 - 6x = n^3 - (x+1)^3 - (x-1)^3 + 2x^3$

for any n and some integral x. Hence $v(3) \le 5$.

On the other hand,

$y^3 \equiv 0,\ 1, \text{ or } -1 \pmod 9;$

and so numbers $9m \pm 4$ require at least 4 cubes. Hence $v(3) \ge 4$.

THEOREM 405: v(3) is 4 or 5.

It is not known whether 4 or 5 is the correct value of v(3). The
identity

$6x = (x+1)^3 + (x-1)^3 - 2x^3$

shows that every multiple of 6 is representable by 4 cubes. Richmond
and Mordell have given many similar identities applying to other
arithmetical progressions. Thus the identity

$6x+3 = x^3 - (x-4)^3 + (2x-5)^3 - (2x-4)^3$

shows that any odd multiple of 3 is representable by 4 cubes.

(3) Biquadrates. By Theorem 402, we have

(21.8.2) $(x+3)^4 - 3(x+2)^4 + 3(x+1)^4 - x^4 = 24x + d$

(where d = 36). The residues of $0^4, 1^4, 3^4, 2^4$ (mod 24) are 0, 1, 9, 16
respectively, and we can easily verify that every residue (mod 24) is
the sum of 4 at most of 0, ±1, ±9, ±16. We express this by saying
that 0, 1, 9, 16 are fourth power residues (mod 24), and that any residue
(mod 24) is representable by 4 of these fourth power residues. Now
we can express any n in the form $n = 24x + d + r$, where $0 \le r < 24$;
and (21.8.2) then shows that any n is representable by 8+4 = 12
numbers $\pm y^4$. Hence $v(4) \le 12$. On the other hand the only fourth
power residues (mod 16) are 0 and 1, and so a number 16m+8 cannot
be represented by 8 numbers $\pm y^4$ unless they are all odd and of the
same sign. Since there are numbers of this form, e.g. 24, which are
not sums of 8 biquadrates, it follows that $v(4) \ge 9$.

THEOREM 406: $9 \le v(4) \le 12$.

(4) Fifth powers. In this case Theorem 402 does not lead to the best
result; we use instead the identity

(21.8.3) $(x+3)^5 - 2(x+2)^5 + x^5 + (x-1)^5 - 2(x-3)^5 + (x-4)^5 = 720x - 360.$

A little calculation shows that every residue (mod 720) can be
represented by two fifth power residues. Hence $v(5) \le 8+2 = 10$.

The only fifth power residues (mod 11) are 0, 1, and -1, and so
numbers of the form $11m \pm 5$ require at least 5 fifth powers.

THEOREM 407: $5 \le v(5) \le 10$.


21.9. The problem of Prouhet and Tarry: the number P(k, j). 
There is another curious problem which has some connexion with that 
of § 21.8 (though we do not develop this connexion here). 

Suppose that the a and b are integers and that

$S_h = S_h(a) = a_1^h + a_2^h + \ldots + a_s^h = \sum a_i^h;$

and consider the system of k equations

(21.9.1) $S_h(a) = S_h(b) \quad (1 \le h \le k).$

It is plain that these equations are satisfied when the b are a
permutation of the a; such a solution we call a trivial solution.

It is easy to prove that there are no other solutions when $s \le k$. It
is sufficient to consider the case s = k. Then

$b_1 + b_2 + \ldots + b_k, \quad b_1^2 + \ldots + b_k^2, \quad \ldots, \quad b_1^k + \ldots + b_k^k$

have the same values as the same functions of the a, and therefore†
the elementary symmetric functions

$\sum b_i, \quad \sum b_i b_j, \quad \ldots, \quad b_1 b_2 \ldots b_k$

have the same values as the same functions of the a. Hence the a
and the b are the roots of the same algebraic equation, and the b are
a permutation of the a.

When s > k there may be non-trivial solutions, and we denote by
P(k, 2) the least value of s for which this is true. It is plain first (since
there are no non-trivial solutions when $s \le k$) that

(21.9.2) $P(k, 2) \ge k+1.$

† By Newton's relations between the coefficients of an equation and the sums of the
powers of its roots.

We may generalize our problem a little. Let us take $j \ge 2$, write

$S_{hu} = a_{1u}^h + a_{2u}^h + \ldots + a_{su}^h$

and consider the set of k(j-1) equations

(21.9.3) $S_{h1} = S_{h2} = \ldots = S_{hj} \quad (1 \le h \le k).$

A non-trivial solution of (21.9.3) is one in which no two sets $a_{iu}$
$(1 \le i \le s)$ and $a_{iv}$ $(1 \le i \le s)$ with $u \ne v$ are permutations of one
another. We write P(k, j) for the least value of s for which there is a
non-trivial solution. Clearly a non-trivial solution of (21.9.3) for $j \ge 2$
includes a non-trivial solution of (21.9.1) for the same s. Hence, by
(21.9.2),

THEOREM 408. $P(k, j) \ge P(k, 2) \ge k+1$.

In the other direction, we prove that

THEOREM 409: $P(k, j) \le \tfrac12 k(k+1) + 1$.


Write $s = \tfrac12 k(k+1) + 1$ and suppose that $n > s!\,s^k j$. Consider all the
sets of integers

(21.9.4) $a_1, a_2, \ldots, a_s$

for which

$1 \le a_r \le n \quad (1 \le r \le s).$

There are $n^s$ such sets. Since $1 \le a_r \le n$, we have

$s \le S_h(a) \le s n^h.$

Hence there are at most

$\prod_{h=1}^{k} (s n^h - s + 1) < s^k n^{\frac12 k(k+1)} = s^k n^{s-1}$

different sets

(21.9.5) $S_1(a), S_2(a), \ldots, S_k(a).$

Now

$s!\,j \cdot s^k n^{s-1} < n^s,$

and so at least s! j of the sets (21.9.4) have the same set (21.9.5). But
the number of permutations of s things, like or unlike, is at most s!,
and so there are at least j sets (21.9.4), no two of which are
permutations of one another and which have the same set (21.9.5). These
provide a non-trivial solution of the equations (21.9.3) with

$s = \tfrac12 k(k+1) + 1.$


21.10. Evaluation of P(k, j) for particular k and j. We prove

THEOREM 410. P(k, j) = k+1 for k = 2, 3, and 5 and all j.

By Theorem 408, we have only to prove that $P(k, j) \le k+1$ and for
this it is sufficient to construct actual solutions of (21.9.3) for any
given j.

By Theorem 337, for any fixed j, there is an n such that

$n = c_1^2 + d_1^2 = c_2^2 + d_2^2 = \ldots = c_j^2 + d_j^2,$

where all the numbers

$c_1, c_2, \ldots, c_j, d_1, d_2, \ldots, d_j$

are positive and no two are equal. If we put

$a_{1u} = c_u, \quad a_{2u} = d_u, \quad a_{3u} = -c_u, \quad a_{4u} = -d_u,$

it follows that

$S_{1u} = 0, \quad S_{2u} = 2n, \quad S_{3u} = 0 \quad (1 \le u \le j)$

and so we have a non-trivial solution of (21.9.3) for k = 3, s = 4.
Hence $P(3, j) \le 4$ and so P(3, j) = 4.
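
As a concrete instance of this construction, the sketch below takes n = 325, which has three representations as a sum of two positive squares, and forms the corresponding sets with s = 4. The choice of n and the helper names `two_square_pairs` and `power_sums` are ours, for illustration.

```python
from math import isqrt

def two_square_pairs(n):
    """All (c, d) with 0 < c < d and c*c + d*d == n."""
    pairs = []
    c = 1
    while 2 * c * c < n:
        d = isqrt(n - c * c)
        if c * c + d * d == n:
            pairs.append((c, d))
        c += 1
    return pairs

def power_sums(values, k):
    """The tuple (S_1, ..., S_k) for the given integers."""
    return tuple(sum(v ** h for v in values) for h in range(1, k + 1))

n = 325  # 325 = 1^2 + 18^2 = 6^2 + 17^2 = 10^2 + 15^2
sets = [(c, d, -c, -d) for c, d in two_square_pairs(n)]
for values in sets:
    print(values, power_sums(values, 3))
```

Each set has the power sums (0, 2n, 0), so the three sets together give a solution of (21.9.3) with k = 3, s = 4, j = 3.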
For k = 2 and k = 5, we use the properties of the quadratic field
$k(\rho)$ found in Chapters XIII and XV. By Theorem 255, $\pi = 3+\rho$ and
$\bar\pi = 3+\rho^2$ are conjugate primes with $\pi\bar\pi = 7$. They are not associates,
since

$\frac{\pi}{\bar\pi} = \frac{\pi^2}{\pi\bar\pi} = \frac{9+6\rho+\rho^2}{7} = \frac87 + \frac57\rho,$

which is not an integer and so, a fortiori, not a unity. Now let u > 0
and let

$\pi^{2u} = A_u - B_u\rho,$

where $A_u$, $B_u$ are rational integers. If $7 \mid A_u$, we have

$\pi\bar\pi \mid A_u, \quad \pi \mid A_u, \quad \pi \mid B_u\rho$

in $k(\rho)$, and

$N\pi \mid B_u^2, \quad 7 \mid B_u^2, \quad 7 \mid B_u$

in k(1). Finally

$7 \mid \pi^{2u}, \quad \pi\bar\pi \mid \pi^{2u}, \quad \bar\pi \mid \pi^{2u}, \quad \bar\pi \mid \pi$

in $k(\rho)$, which is false. Hence $7 \nmid A_u$ and, similarly, $7 \nmid B_u$.
If we write

$c_u = 7^{j-u}A_u, \quad d_u = 7^{j-u}B_u,$

we have

$c_u^2 + c_u d_u + d_u^2 = N(c_u - d_u\rho) = 7^{2j-2u}\,N\pi^{2u} = 7^{2j}.$

Hence, if we put $a_{1u} = c_u$, $a_{2u} = d_u$, $a_{3u} = -(c_u+d_u)$, we have $S_{1u} = 0$
and

$S_{2u} = c_u^2 + d_u^2 + (c_u+d_u)^2 = 2(c_u^2 + c_u d_u + d_u^2) = 2 \cdot 7^{2j}.$

Since at least two of $(a_{1u}, a_{2u}, a_{3u})$ are divisible by $7^{j-u}$ but not by
$7^{j-u+1}$, no set is a permutation of any other set and we have a
non-trivial solution of (21.9.3) with k = 2 and s = 3. Thus P(2, j) = 3.
For k = 5, we write

$a_{1u} = c_u, \quad a_{2u} = d_u, \quad a_{3u} = -c_u - d_u, \quad a_{4u} = -a_{1u},$

$a_{5u} = -a_{2u}, \quad a_{6u} = -a_{3u}$

and have

$S_{1u} = S_{3u} = S_{5u} = 0, \quad S_{2u} = 4 \cdot 7^{2j},$

$S_{4u} = 2\{c_u^4 + d_u^4 + (c_u+d_u)^4\} = 4(c_u^2 + c_u d_u + d_u^2)^2 = 4 \cdot 7^{4j}.$

As before, we have no trivial solutions and so P(5, j) = 6.
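
The constructions for k = 2 and k = 5 can be carried out explicitly. In the sketch below a number $a + b\rho$ of $k(\rho)$ is stored as the pair (a, b), and products are reduced by $\rho^2 = -1 - \rho$. The function names `mul` and `prouhet_sets` are hypothetical, chosen for this illustration.

```python
def mul(z, w):
    """Product (a + b rho)(c + d rho) in k(rho), using rho^2 = -1 - rho."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c - b * d)

def prouhet_sets(j):
    """Sets (c_u, d_u, -(c_u + d_u)) for u = 1..j, as in the case k = 2."""
    pi_squared = mul((3, 1), (3, 1))      # pi = 3 + rho
    power = (1, 0)
    sets = []
    for u in range(1, j + 1):
        power = mul(power, pi_squared)    # pi^(2u) = A_u - B_u rho
        A, B = power[0], -power[1]
        c, d = 7 ** (j - u) * A, 7 ** (j - u) * B
        sets.append((c, d, -(c + d)))
    return sets

j = 3
for triple in prouhet_sets(j):
    six = triple + tuple(-v for v in triple)   # the six numbers used for k = 5
    print(triple,
          sum(v ** 2 for v in triple) == 2 * 7 ** (2 * j),
          sum(v ** 4 for v in six) == 4 * 7 ** (4 * j))
```

With j = 2 the two triples are (56, -35, -21) and (39, -55, 16), each with sum 0 and sum of squares $2 \cdot 7^4$.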

The fact that, in the last solution for example, $S_{1u} = S_{3u} = S_{5u} = 0$
does not make the solution so special as appears at first sight. For, if

$a_{ru} = A_{ru} \quad (1 \le r \le s,\ 1 \le u \le j)$

is one solution of (21.9.3), it can easily be verified that, for any d,

$a_{ru} = A_{ru} + d$

is another such solution. Thus we can readily obtain solutions in which
none of the S is zero.

The case j = 2 can be handled successfully by methods of little use
for larger j. If $a_1, a_2, \ldots, a_s, b_1, b_2, \ldots, b_s$ is a solution of (21.9.1), then

(21.10.1) $\sum_{i=1}^{s} \{(a_i+d)^h + b_i^h\} = \sum_{i=1}^{s} \{a_i^h + (b_i+d)^h\} \quad (1 \le h \le k+1)$

for every d. For we may reduce these to

$\sum_{l=1}^{h-1} \binom{h}{l} S_{h-l}(a)\,d^l = \sum_{l=1}^{h-1} \binom{h}{l} S_{h-l}(b)\,d^l \quad (2 \le h \le k+1)$

and these follow at once from (21.9.1).

We choose d to be the number which occurs most frequently as a 
difference between two a or two b. We are then able to remove a good 
many terms which occur on both sides of the identity (21.10.1). 


We write

$[a_1, \ldots, a_s]_k = [b_1, \ldots, b_s]_k$

to denote that $S_h(a) = S_h(b)$ for $1 \le h \le k$. Then

$[0, 3]_1 = [1, 2]_1.$

Using (21.10.1), with d = 3, we get

$[1, 2, 3, 6]_2 = [0, 3, 4, 5]_2,$

or

$[1, 2, 6]_2 = [0, 4, 5]_2.$

Starting from the last equation and taking d = 5 in (21.10.1), we
obtain

$[0, 4, 7, 11]_3 = [1, 2, 9, 10]_3.$

From this we deduce in succession

$[1, 2, 10, 14, 18]_4 = [0, 4, 8, 16, 17]_4$ (d = 7),
$[0, 4, 9, 17, 22, 26]_5 = [1, 2, 12, 14, 24, 25]_5$ (d = 8),
$[1, 2, 12, 13, 24, 30, 35, 39]_6 = [0, 4, 9, 15, 26, 27, 37, 38]_6$ (d = 13),
$[0, 4, 9, 23, 27, 41, 46, 50]_7 = [1, 2, 11, 20, 30, 39, 48, 49]_7$ (d = 11).

Hence $P(k, 2) \le k+1$ for $k \le 5$ and for k = 7.

The example†

$[0, 18, 27, 58, 64, 89, 101]_6 = [1, 13, 38, 44, 75, 84, 102]_6$

shows that $P(k, 2) \le k+1$ for k = 6; and these results, with Theorem
408, give

THEOREM 411. If $k \le 7$, P(k, 2) = k+1.
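
The chain of equations above can be reproduced mechanically. The following sketch applies the step (21.10.1) with the successive values of d, cancels the terms common to both sides, and checks the resulting power sums; the helper names `equal_power_sums` and `shift_step` are ours.

```python
from collections import Counter

def equal_power_sums(a, b, k):
    """True when the sums of hth powers of a and b agree for h = 1..k."""
    return all(sum(x ** h for x in a) == sum(x ** h for x in b)
               for h in range(1, k + 1))

def shift_step(a, b, d):
    """Apply (21.10.1) to [a]_k = [b]_k and cancel the common terms."""
    left = Counter(x + d for x in a) + Counter(b)
    right = Counter(a) + Counter(y + d for y in b)
    common = left & right
    return (sorted((left - common).elements()),
            sorted((right - common).elements()))

a, b = [0, 3], [1, 2]
for k, d in enumerate([3, 5, 7, 8, 13, 11], start=2):
    a, b = shift_step(a, b, d)
    assert equal_power_sums(a, b, k)
    print(k, a, b)

# the separate example for k = 6
assert equal_power_sums([0, 18, 27, 58, 64, 89, 101],
                        [1, 13, 38, 44, 75, 84, 102], 6)
```

The step always raises the degree by one, but the number of terms stays at k+1 only when enough terms cancel, which is why d is chosen as a frequently occurring difference.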


21.11. Further problems of Diophantine analysis. We end this 
chapter by a few unsystematic remarks about a number of Diophantine 
equations which are suggested by Fermat’s problem of Ch. XIII. 

(1) A conjecture of Euler. Can a kth power be the sum of s positive
kth powers? Is

(21.11.1) $x_1^k + x_2^k + \ldots + x_s^k = y^k$

soluble in positive integers? 'Fermat's last theorem' asserts the
impossibility of the equation when s = 2 and k > 2, and Euler extended
the conjecture to the values 3, 4,..., k-1 of s. For k = 5, s = 4,
however, the conjecture is false, since

$27^5 + 84^5 + 110^5 + 133^5 = 144^5.$

The equation

(21.11.2) $x_1^k + x_2^k + \ldots + x_k^k = y^k$

has also attracted much attention. The case k = 2 is familiar.‡ When
k = 3 we can derive solutions from the analysis of § 13.7. If we put
$\lambda = 1$ and a = -3b in (13.7.8), and then write $-\tfrac12 q$ for b, we obtain

(21.11.3) $x = 1 - 9q^3, \quad y = -1, \quad u = -9q^4, \quad v = 9q^4 - 3q;$

and so, by (13.7.2),

$(9q^4)^3 + (3q - 9q^4)^3 + (1 - 9q^3)^3 = 1.$

If we now replace q by $\xi/\eta$ and multiply by $\eta^{12}$, we obtain the identity

(21.11.4) $(9\xi^4)^3 + (3\xi\eta^3 - 9\xi^4)^3 + (\eta^4 - 9\xi^3\eta)^3 = (\eta^4)^3.$

All the cubes are positive if

$0 < \xi < 9^{-1/3}\eta,$

so that any twelfth power $\eta^{12}$ can be expressed as a sum of three positive
cubes in at least $[9^{-1/3}\eta]$ ways.

† This may be proved by starting with
$[1, 8, 12, 15, 20, 23, 27, 34]_1 = [0, 7, 11, 17, 18, 24, 28, 35]_1,$
and taking d = 7, 11, 13, 17, 19 in succession.

‡ See § 13.2.

When k > 3, little is known. A few particular solutions of (21.11.2)
are known for k = 4, the smallest of which is

(21.11.5) $30^4 + 120^4 + 272^4 + 315^4 = 353^4.$†

For k = 5 there are an infinity included in the identity

(21.11.6) $(75y^5 - x^5)^5 + (x^5 + 25y^5)^5 + (x^5 - 25y^5)^5 + (10x^3y^2)^5 + (50xy^4)^5 = (x^5 + 75y^5)^5.$

All the powers are positive if $0 < 25y^5 < x^5 < 75y^5$. No solution is
known with $k \ge 6$.
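
The numerical examples and the identities (21.11.4) and (21.11.6) can be confirmed with exact integer arithmetic. The sketch below is a check on sample values only, not a proof; the function names `cubic_identity` and `quintic_identity` are ours.

```python
def cubic_identity(xi, eta):
    """Return (left side, right side) of (21.11.4)."""
    left = ((9 * xi ** 4) ** 3
            + (3 * xi * eta ** 3 - 9 * xi ** 4) ** 3
            + (eta ** 4 - 9 * xi ** 3 * eta) ** 3)
    return left, (eta ** 4) ** 3

def quintic_identity(x, y):
    """Return (left side, right side) of (21.11.6)."""
    terms = [75 * y ** 5 - x ** 5, x ** 5 + 25 * y ** 5, x ** 5 - 25 * y ** 5,
             10 * x ** 3 * y ** 2, 50 * x * y ** 4]
    return sum(t ** 5 for t in terms), (x ** 5 + 75 * y ** 5) ** 5

# the counter-example to Euler's conjecture and the example (21.11.5)
assert 27 ** 5 + 84 ** 5 + 110 ** 5 + 133 ** 5 == 144 ** 5
assert 30 ** 4 + 120 ** 4 + 272 ** 4 + 315 ** 4 == 353 ** 4

for xi in range(1, 6):
    for eta in range(1, 12):
        left, right = cubic_identity(xi, eta)
        assert left == right

for x in range(1, 6):
    for y in range(1, 6):
        left, right = quintic_identity(x, y)
        assert left == right
```

For instance x = 2, y = 1 satisfies $25y^5 < x^5 < 75y^5$, so all five fifth powers in (21.11.6) are then positive.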
(2) Equal sums of two kth powers. Is

(21.11.7) $x_1^k + y_1^k = x_2^k + y_2^k$

soluble in positive integers? More generally, is

(21.11.8) $x_1^k + y_1^k = x_2^k + y_2^k = \ldots = x_r^k + y_r^k$

soluble for given k and r?

The answers are affirmative when k = 2, since, by Theorem 337, we
can choose n so as to make r(n) as large as we please. We shall now
prove that they are also affirmative when k = 3.


† The identity
$(4x^4 - y^4)^4 + 2(4x^3y)^4 + 2(2xy^3)^4 = (4x^4 + y^4)^4$
gives an infinity of biquadrates expressible as sums of 5 biquadrates (with two equal
pairs); and the identity
$(x^2 - y^2)^4 + (2xy + y^2)^4 + (2xy + x^2)^4 = 2(x^2 + xy + y^2)^4$
gives an infinity of solutions of
$x_1^4 + x_2^4 + x_3^4 = y_1^4 + y_2^4$
(all with $y_1 = y_2$).

THEOREM 412. Whatever r, there are numbers which are representable
as sums of two positive cubes in at least r different ways.

We use two identities, viz.

(21.11.9) $X^3 - Y^3 = x_1^3 + y_1^3$

if

(21.11.10) $X = \frac{x_1(x_1^3 + 2y_1^3)}{x_1^3 - y_1^3}, \quad Y = \frac{y_1(2x_1^3 + y_1^3)}{x_1^3 - y_1^3},$

and

(21.11.11) $x_2^3 + y_2^3 = X^3 - Y^3$

if

(21.11.12) $x_2 = \frac{X(X^3 - 2Y^3)}{X^3 + Y^3}, \quad y_2 = \frac{Y(2X^3 - Y^3)}{X^3 + Y^3}.$

Each identity is an obvious corollary of the other, and either may be
deduced from the formulae of § 13.7.† From (21.11.9) and (21.11.11)
it follows that

(21.11.13) $x_1^3 + y_1^3 = x_2^3 + y_2^3.$

Here $x_2$, $y_2$ are rational if $x_1$, $y_1$ are rational.

Suppose now that r is given, that $x_1$ and $y_1$ are rational and positive
and that

$\frac{x_1}{4^{r-1}y_1}$

is large. Then X, Y are positive, and X/Y is nearly $x_1/2y_1$; and $x_2$, $y_2$ are
positive and $x_2/y_2$ is nearly X/2Y or $x_1/4y_1$.

Starting now with $x_2$, $y_2$ in place of $x_1$, $y_1$ and repeating the argument,
we obtain a third pair of rationals $x_3$, $y_3$ such that

$x_1^3 + y_1^3 = x_2^3 + y_2^3 = x_3^3 + y_3^3$

and $x_3/y_3$ is nearly $x_1/4^2y_1$. After r applications of the argument we
obtain

(21.11.14) $x_1^3 + y_1^3 = x_2^3 + y_2^3 = \ldots = x_r^3 + y_r^3,$

all the numbers involved being positive rationals, and

$\frac{x_1}{y_1}, \quad \frac{4x_2}{y_2}, \quad \frac{4^2x_3}{y_3}, \quad \ldots, \quad \frac{4^{r-1}x_r}{y_r}$

all being nearly equal, so that the ratios $x_s/y_s$ (s = 1, 2,..., r) are certainly
unequal. If we multiply (21.11.14) by $l^3$, where l is the least common
multiple of the denominators of $x_1, y_1, \ldots, x_r, y_r$, we obtain an integral
solution of the system (21.11.14).
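
The argument can be followed with exact rational arithmetic. The sketch below iterates the substitutions (21.11.10) and (21.11.12) and then clears denominators; the function names `vieta_step` and `cube_representations`, and the starting ratio 100, are illustrative assumptions of ours rather than anything prescribed in the text.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def vieta_step(x, y):
    """Map (x1, y1) to (x2, y2) by (21.11.10) followed by (21.11.12)."""
    X = x * (x ** 3 + 2 * y ** 3) / (x ** 3 - y ** 3)
    Y = y * (2 * x ** 3 + y ** 3) / (x ** 3 - y ** 3)
    x2 = X * (X ** 3 - 2 * Y ** 3) / (X ** 3 + Y ** 3)
    y2 = Y * (2 * X ** 3 - Y ** 3) / (X ** 3 + Y ** 3)
    return x2, y2

def cube_representations(r, start_ratio):
    """Return r integer pairs (x, y) with equal x^3 + y^3.

    Raises ValueError when start_ratio = x1/y1 is too small for every
    pair to stay positive.
    """
    pairs = [(Fraction(start_ratio), Fraction(1))]
    while len(pairs) < r:
        pairs.append(vieta_step(*pairs[-1]))
    if any(x <= 0 or y <= 0 for x, y in pairs):
        raise ValueError("start_ratio too small")
    denominators = [v.denominator for pair in pairs for v in pair]
    l = reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)
    return [(int(x * l), int(y * l)) for x, y in pairs]

reps = cube_representations(3, 100)
print(len({x ** 3 + y ** 3 for x, y in reps}))   # number of distinct sums
print([x / y for x, y in reps])                  # ratios close to 100, 25, 6.25
```

The integers produced are very large, since each step roughly multiplies the number of digits by a fixed factor; the point of the theorem is existence, not smallness.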
† If we put a = b and $\lambda = 1$ in (13.7.8), we obtain
$x = 8a^3 + 1, \quad y = 16a^3 - 1, \quad u = 4a - 16a^4, \quad v = 2a + 16a^4;$
and if we replace a by $\tfrac12 q$, and use (13.7.2), we obtain
$(q^4 - 2q)^3 + (2q^3 - 1)^3 = (q^4 + q)^3 - (q^3 + 1)^3,$
an identity equivalent to (21.11.11).

Solutions of

$x_1^4 + y_1^4 = x_2^4 + y_2^4$

can be deduced from the formulae (13.7.11); but no solution of

$x_1^4 + y_1^4 = x_2^4 + y_2^4 = x_3^4 + y_3^4$

is known. And no solution of (21.11.7) is known for $k \ge 5$.

Swinnerton-Dyer has found a parametric solution of

(21.11.15) $x_1^5 + x_2^5 + x_3^5 = y_1^5 + y_2^5 + y_3^5$

which yields solutions in positive integers. A numerical solution is

(21.11.16) $49^5 + 75^5 + 107^5 = 39^5 + 92^5 + 100^5.$

The smallest result of this kind for sixth powers is

(21.11.17) $3^6 + 19^6 + 22^6 = 10^6 + 15^6 + 23^6.$


NOTES ON CHAPTER XXI

A great deal of work has been done on Waring’s problem during the last fifty 
years, and it may be worth while to give a short summary of the results. We 
have already referred to Waring’s original statement, to Hilbert’s proof of the 
existence of g(k), and to the proof that g(3) = 9 [Wieferich, Math. Annalen, 66 
(1909), 99-101, corrected by Kempner, ibid. 72 (1912), 387-97]. 

Landau [ibid. 66 (1909), 102-5] proved that $G(3) \le 8$ and it was not until 1942
that Linnik [Comptes Rendus (Doklady) Acad. Sci. USSR, 35 (1942), 162]
announced a proof that $G(3) \le 7$. Dickson [Bull. Amer. Math. Soc. 45 (1939),
588-91] showed that 8 cubes suffice for all but 23 and 239. See G. L. Watson, Math.
Gazette, 37 (1953), 209-11, for a simple proof that $G(3) \le 8$ and Journ. London
Math. Soc. 26 (1951), 153-6 for one that $G(3) \le 7$ and for further references. After
Theorem 394, $G(3) \ge 4$, so that G(3) is 4, 5, 6, or 7; it is still uncertain which,
though the evidence of tables points very strongly to 4 or 5. See Western, ibid.
1 (1926), 244-50.

Hardy and Littlewood, in a series of papers under the general title ‘Some 
problems of partitio numerorum’, published between 1920 and 1928, developed 
a new analytic method for the study of Waring’s problem. They found upper 
bounds for G(k) for any k, the first being

$(k-2)2^{k-1} + 5,$

and the second a more complicated function of k which is asymptotic to $k2^{k-2}$
for large k. In particular they proved that

(a) $G(4) \le 19, \quad G(5) \le 41, \quad G(6) \le 87, \quad G(7) \le 193, \quad G(8) \le 425.$

Their method did not lead to any new result for G(3); but they proved that
'almost all' numbers are sums of 5 cubes.

Davenport, Acta Math. 71 (1939), 123-43, has proved that almost all are sums 
of 4. Since numbers $9m \pm 4$ require at least 4 cubes, this is the final result.

Hardy and Littlewood also found an asymptotic formula for the number of
representations for n by s kth powers, by means of the so-called 'singular series'.
Thus $r_{4,21}(n)$, the number of representations of n by 21 biquadrates, is
approximately

$\frac{\{2\Gamma(\frac54)\}^{21}}{\Gamma(\frac{21}{4})}\, n^{17/4} \{1 + 1\cdot331 \cos(\tfrac18 n\pi + \tfrac{11}{16}\pi) + 0\cdot379 \cos(\tfrac14 n\pi - \tfrac58\pi) + \ldots\}$

(the later terms of the series being smaller). There is a detailed account of all
this work (except on its 'numerical' side) in Landau, Vorlesungen, i. 235-339.

As regards g(k), the best results known, up to 1933, for small k, were

$g(4) \le 37, \quad g(5) \le 58, \quad g(6) \le 478, \quad g(7) \le 3806, \quad g(8) \le 31353$

(due to Wieferich, Baer, Baer, Wieferich, and Kempner respectively). All these
had been found by elementary methods similar to those used in §§ 21.1-4. The
results of Hardy and Littlewood made it theoretically possible to find an upper
bound for g(k) for any k, though the calculations required for comparatively
large k would have been impracticable. James, however, in a paper published
in Trans. Amer. Math. Soc. 36 (1934), 395-444, succeeded in proving that

(b) $g(6) \le 183, \quad g(7) \le 322, \quad g(8) \le 595.$

He also found bounds for g(9) and g(10).

The more recent work of Vinogradov has made it possible to obtain much
more satisfactory results. Vinogradov's earlier researches on Waring's problem
had been published in 1924, and there is an account of his method in Landau,
Vorlesungen, i. 340-58. The method then used by Vinogradov resembled that
of Hardy and Littlewood in principle, but led more rapidly to some of their
results and in particular to a comparatively simple proof of Hilbert's theorem.
It could also be used to find an upper bound for g(k), and in particular to prove
that

$\overline{\lim_{k \to \infty}}\ \frac{g(k)}{k2^{k-1}} \le 1.$

In his later work Vinogradov made very important improvements, based primarily 
on a new and powerful method for the estimation of certain trigonometrical sums, 
and obtained results which are, for large k, far better than any known before. 
Thus he proved that 


(c) $G(k) \le 6k\log k + (4 + \log 216)k;$

so that G(k) is at most of order k log k. Vinogradov's proof was afterwards
simplified considerably by Heilbronn [Acta arithmetica, 1 (1936), 212-21], who
improved (c) to

(d) $G(k) \le 6k\log k + \{4 + 3\log(3 + 2/k)\}k + 3.$

It follows from (d) that

$G(4) \le 67, \quad G(5) \le 89, \quad G(6) \le 113, \quad G(7) \le 137, \quad G(8) \le 163.$


These inequalities are inferior to (a) for k = 4, 5, or 6; but better when k > 6 
(and naturally far better for large values of k). 
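
The numerical bounds quoted above are simply the integral parts of the right-hand side of (d), with natural logarithms. A minimal sketch, with a function name `heilbronn_bound` of our own choosing:

```python
from math import floor, log

def heilbronn_bound(k):
    """Integral part of 6k log k + {4 + 3 log(3 + 2/k)} k + 3."""
    return floor(6 * k * log(k) + (4 + 3 * log(3 + 2 / k)) * k + 3)

print([heilbronn_bound(k) for k in range(4, 9)])  # [67, 89, 113, 137, 163]
```

Floating-point evaluation is adequate here because none of the five values lies close to an integer.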


More has been proved since concerning the cases k = 4, 5, and 6: in particular, 
the value of G(4) is now known. Davenport and Heilbronn [Proc. London Math.
Soc. (2) 41 (1936), 143-50] and Estermann (ibid. 126-42) proved independently
that $G(4) \le 17$. Finally Davenport [Annals of Math. 40 (1939), 731-47] proved
that $G(4) \le 16$, so that, after Theorem 395, G(4) = 16; and that any number
not congruent to 14 or 15 (mod 16) is a sum of 14 biquadrates. He also proved
[Amer. Journal of Math. 64 (1942), 199-207] that $G(5) \le 23$ and $G(6) \le 36$:
Hua had proved that $G(5) \le 28$, and Estermann [Acta arithmetica, 2 (1937),
197-211] a result of which $G(6) \le 42$ is a particular case.

It was conjectured by Hardy and Littlewood that

$G(k) \le 2k+1,$

except when $k = 2^m$ and m > 1, when G(k) = 4k; but the truth or falsity of
these conjectures is still undecided, except for k = 2 and k = 4.

Vinogradov's work has also led to very remarkable results concerning g(k).
If we know that G(k) does not exceed some upper bound $\overline{G}(k)$, so that numbers
greater than C(k) are representable by $\overline{G}(k)$ or fewer kth powers, then the way
is open to the determination of an upper bound for g(k). For we have only to
study the representation of numbers up to C(k), and this is logically, for a given
k, a question of computation. It was thus that James determined the bounds
set out in (b); but the results of such work, before Vinogradov's, were inevitably
unsatisfactory, since the bounds (a) for G(k) found by Hardy and Littlewood
are (except for quite small values of k) much too large, and in particular larger
than the lower bounds for g(k) given by Theorem 393.

If

$\underline{g}(k) = 2^k + [(\tfrac32)^k] - 2$

is the lower bound for g(k) assigned by Theorem 393, and if, for the moment,
we take $\overline{G}(k)$ to be the upper bound for G(k) assigned by (d), then $\underline{g}(k)$ is of much
higher order of magnitude than $\overline{G}(k)$. Theorem 393 gives

$g(4) \ge 19, \quad g(5) \ge 37, \quad g(6) \ge 73, \quad g(7) \ge 143, \quad g(8) \ge 279;$

and $\underline{g}(k) > \overline{G}(k)$ for $k \ge 7$. Thus if $k \ge 7$, if all numbers from C(k) on are
representable by $\overline{G}(k)$ powers, and all numbers below C(k) by $\underline{g}(k)$ powers, then

$g(k) = \underline{g}(k).$

And it is not necessary to determine the C(k) corresponding to this particular
$\overline{G}(k)$; it is sufficient to know the C(k) corresponding to any $\overline{G}(k) \le \underline{g}(k)$, and in
particular to $\overline{G}(k) = \underline{g}(k)$.

This type of argument has led to an ‘almost complete’ solution of the original 
form of Waring’s problem. The first, and deepest, part of the solution rests on 
an adaptation of Vinogradov’s method. The second depends on an ingenious 
use of a ‘method of ascent’, a simple case of which appears in the proof, in § 21.3, 
of Theorem 390. 

Let us write

$A = [(\tfrac32)^k], \quad B = 3^k - 2^k A, \quad D = [(\tfrac43)^k].$

The final result is that

(e) $g(k) = 2^k + A - 2$

for all k for which $k \ge 5$ and

(f) $B \le 2^k - A - 2.$

In this case the value of g(k) is fixed by the number

$n = 2^k A - 1 = (A-1)2^k + (2^k - 1) \cdot 1^k$

used in the proof of Theorem 393, a comparatively small number representable
only by powers of 1 and 2. The condition (f) is satisfied for $4 \le k \le 200000$
[Stemmler, Math. Computation 18 (1964), 144-6] and may well be true for all
k > 3.

It is known that $B \ne 2^k - A - 1$ and that $B \ne 2^k - A$ (except for k = 1).
If $B \ge 2^k - A + 1$, the formula for g(k) is different. In this case,

$g(k) = 2^k + A + D - 3$ if $2^k < AD + A + D$

and

$g(k) = 2^k + A + D - 2$ if $2^k = AD + A + D$.

It is readily shown that $2^k \le AD + A + D$.
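
The formulae above translate directly into a short computation. The sketch below is an illustration only: the function name `waring_g` is ours, and the restriction to $k \ge 5$ mirrors the range for which the result is stated here.

```python
def waring_g(k):
    """Value of g(k) given by (e), or by the alternative formulae when (f) fails."""
    if k < 5:
        raise ValueError("the result is stated here only for k >= 5")
    A = 3 ** k // 2 ** k          # A = [(3/2)^k]
    B = 3 ** k - 2 ** k * A
    D = 4 ** k // 3 ** k          # D = [(4/3)^k]
    if B <= 2 ** k - A - 2:       # condition (f)
        return 2 ** k + A - 2
    if 2 ** k < A * D + A + D:
        return 2 ** k + A + D - 3
    return 2 ** k + A + D - 2

print([waring_g(k) for k in range(5, 9)])  # [37, 73, 143, 279]
```

For every k in this small range condition (f) holds, so only the first branch is exercised.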

Most of these results were found independently by Dickson [Amer. Journal
of Math. 58 (1936), 521-9, 530-5] and Pillai [Journal Indian Math. Soc. (2) 2
(1936), 16-44, and Proc. Indian Acad. Sci. (A), 4 (1936), 261]. They were
completed by Pillai [ibid. 12 (1940), 30-40] who proved that g(6) = 73, by
Rubugunday [Journal Indian Math. Soc. (2) 6 (1942), 192-8] who proved that
$B \ne 2^k - A$, by Niven [Amer. Journal of Math. 66 (1944), 137-43] who proved
(e) when $B = 2^k - A - 2$, a case previously unsolved, and by Jing-run Chen
[Chinese Math.-Acta 6 (1965), 105-27] who proved that g(5) = 37.

The solution is now complete except for k = 4, and for the uncertainty whether
(f) can be false for any k. The best-known inequality for 4 is

$19 \le g(4) \le 35;$

the upper bound here is due to Dickson [Bull. American Math. Soc. 39 (1933),
701-27].

It will be observed that (except when k = 4) there is much more uncertainty 
about the value of G(k) than about that of g(k); the most striking case is k = 3.
This is natural, since the value of G(k) depends on the deeper properties of the 
whole sequence of integers, and that of g(k) on the more trivial properties of 
special numbers near the beginning. 

§ 21.1. Liouville proved, in 1859, that $g(4) \le 53$. This upper bound was
improved gradually until Wieferich, in 1909, found the upper bound 37 (the best
result arrived at by elementary methods). We have already referred to Dickson's
later proof that $g(4) \le 35$.

References to the older literature relevant to this and the next few sections 
will be found in Bachmann, Niedere Zahlentheorie, ii. 328-48, or Dickson, History, 
ii, ch. xxv. 

§§ 21.2-3. See the note on § 20.1 and the historical note which precedes.

§ 21.4. The proof for g(6) is due to Fleck. Maillet proved the existence of g(8) 
by a more complicated identity than (21.4.2); the latter is due to Hurwitz. 
Schur found a similar proof for g(10).

§ 21.5. The special numbers n considered here were observed by Euler (and
probably by Waring). 

§ 21.6. Theorem 394 is due to Maillet and Hurwitz, and Theorems 395 and 396 
to Kempner. The other lower bounds for G(k) were investigated systematically 
by Hardy and Littlewood, Proc. London Math. Soc. (2) 28 (1928), 618-42.

§§ 21.7-8. For the results of these sections see Wright, Journal London Math. 
Soc. 9 (1934), 267-72, where further references are given; Mordell, ibid. 11 (1936), 
208-18; and Richmond, ibid. 12 (1937), 206. 

Hunter, Journal London Math. Soc. 16 (1941), 177-9 proved that $9 \le v(4) \le 10$:
we have incorporated in the text his simple proof that $v(4) \ge 9$.

§§ 21.9-10. Prouhet [Comptes Rendus Paris, 33 (1851), 225] found the first
non-trivial result in this problem. He gave a rule to separate the first $j^{k+1}$ positive
integers into j sets of $j^k$ members, which provide a solution of (21.9.3) with $s = j^k$.
For a simple proof of Prouhet's rule, see Wright, Proc. Edinburgh Math. Soc.
(2) 8 (1949), 138-42. See Dickson, History, ii, ch. xxiv, and Gloden and Palamà,
Bibliographie des Multigrades (Luxembourg, 1948), for general references.
Theorem 408 is due to Bastien [Sphinx-Oedipe 8 (1913), 171-2] and Theorem 409
to Wright [Bull. American Math. Soc. 54 (1948), 755-7].

§ 21.10. Theorem 410 is due to Gloden [Mehrgradige Gleichungen, Groningen, 
1944, 71-90]. For Theorem 411, see Tarry, L’intermédiaire des mathématiciens, 
20 (1913), 68-70, and Escott, Quarterly Journal of Math. 41 (1910), 152. 

A. Létac found the examples

$[1, 25, 31, 84, 87, 134, 158, 182, 198]_8 = [2, 18, 42, 66, 113, 116, 169, 175, 199]_8$

and

$[\pm12, \pm11881, \pm20231, \pm20885, \pm23738]_9 = [\pm436, \pm11857, \pm20449, \pm20667, \pm23750]_9,$

which show that P(k, 2) = k+1 for k = 8 and k = 9. See A. Létac, Gazeta
Matematica 48 (1942), 68-69, and A. Gloden, loc. cit.

§ 21.11. The most important result in this section is Theorem 412. The
relations (21.11.9)-(21.11.12) are due to Vieta; they were used by Fermat to find
solutions of (21.11.14) for any r (see Dickson, History, ii. 550-1). Fermat assumed
without proof that all the pairs $x_s$, $y_s$ (s = 1, 2,..., r) would be different. The first
complete proof was found by Mordell, but not published.

Of the other identities and equations which we quote, (21.11.4) is due to
Gérardin [L'intermédiaire des math. 19 (1912), 7] and the corollary to Mahler
[Journal London Math. Soc. 11 (1936), 136-8], (21.11.6) to Sastry [ibid. 9 (1934),
242-6], the parametric solution of (21.11.15) to Swinnerton-Dyer [Proc. Cambridge
Phil. Soc. 48 (1952), 516-8], (21.11.16) to Moessner [Proc. Ind. Math. Soc. A 10
(1939), 296-306], (21.11.17) to Subba Rao [Journal London Math. Soc. 9 (1934),
172-3], and (21.11.5) to Norrie. Patterson found a further solution and Leech 6
further solutions of (21.11.2) for k = 4 [Bull. Amer. Math. Soc. 48 (1942), 736 and
Proc. Cambridge Phil. Soc. 54 (1958), 554-5]. The identities quoted in the
footnote to p. 333 were found by Fauquembergue and Gérardin respectively. For
detailed references to the work of Norrie and the last two authors and to much
similar work, see Dickson, History, ii. 650-4. Lander and Parkin [Math. Computation
21 (1967), 101-3] found the result which disproves Euler's conjecture for k = 5,
s = 4.


XXII 
THE SERIES OF PRIMES (3) 


22.1. The functions $\vartheta(x)$ and $\psi(x)$. In this chapter we return to
the problems concerning the distribution of primes of which we gave 
a preliminary account in the first two chapters. There we proved 
nothing except Euclid’s Theorem 4 and the slight extensions contained 
in §§ 2.1-6. Here we develop the theory much further and, in particular, 
prove Theorem 6 (the Prime Number Theorem). We begin, however, 
by proving the much simpler Theorem 7. 

Our proof of Theorems 6 and 7 depends upon the properties of a
function $\psi(x)$ and (to a lesser extent) of a function $\vartheta(x)$. We write†

(22.1.1) $\vartheta(x) = \sum_{p \le x} \log p = \log \prod_{p \le x} p$

and

(22.1.2) $\psi(x) = \sum_{p^m \le x} \log p = \sum_{n \le x} \Lambda(n)$

(in the notation of § 17.7). Thus

$\psi(10) = 3\log 2 + 2\log 3 + \log 5 + \log 7,$

there being a contribution log 2 from 2, 4, and 8, and a contribution
log 3 from 3 and 9. If $p^m$ is the highest power of p not exceeding x,
log p occurs m times in $\psi(x)$. Also $p^m$ is the highest power of p which
divides any number up to x, so that

(22.1.3) $\psi(x) = \log U(x),$

where U(x) is the least common multiple of all numbers up to x. We
can also express $\psi(x)$ in the form

(22.1.4) $\psi(x) = \sum_{p \le x} \left[\frac{\log x}{\log p}\right] \log p.$


The definitions of $\vartheta(x)$ and $\psi(x)$ are more complicated than that of $\pi(x)$, but
they are in reality more 'natural' functions. Thus $\psi(x)$ is, after (22.1.2), the
'sum function' of $\Lambda(n)$, and $\Lambda(n)$ has (as we saw in § 17.7) a simple generating
function. The generating functions of $\vartheta(x)$, and still more of $\pi(x)$, are much more
complicated. And even the arithmetical definition of $\psi(x)$, when written in the
form (22.1.3), is very elementary and natural.

† Throughout this chapter x (and y and t) are not necessarily integral. On the other
hand, m, n, h, k, etc., are positive integers and p, as usual, is a prime. We suppose
always that $x \ge 1$.




Since $p^2 \le x$, $p^3 \le x$, ... are equivalent to $p \le x^{1/2}$, $p \le x^{1/3}$, ..., we have 

(22.1.5) $\psi(x) = \vartheta(x) + \vartheta(x^{1/2}) + \vartheta(x^{1/3}) + \dots = \sum_{m} \vartheta(x^{1/m}).$

The series breaks off when $x^{1/m} < 2$, i.e. when 

$m > \frac{\log x}{\log 2}.$

It is obvious from the definition that $\vartheta(x) < x \log x$ for $x \ge 2$. A fortiori 

$\vartheta(x^{1/m}) < x^{1/m} \log x \le x^{1/2} \log x$

if $m \ge 2$; and 

$\sum_{m \ge 2} \vartheta(x^{1/m}) = O\{x^{1/2} (\log x)^2\},$

since there are only $O(\log x)$ terms in the series. Hence 

THEOREM 413: $\psi(x) = \vartheta(x) + O\{x^{1/2} (\log x)^2\}.$

We are interested in the order of magnitude of the functions. Since 

$\pi(x) = \sum_{p \le x} 1, \qquad \vartheta(x) = \sum_{p \le x} \log p,$

it is natural to expect $\vartheta(x)$ to be ‘about $\log x$ times’ $\pi(x)$. We shall see later that 
this is so. We prove next that $\vartheta(x)$ is of order x, so that Theorem 413 tells us that 
$\psi(x)$ is ‘about the same as’ $\vartheta(x)$ when x is large. 

22.2. Proof that $\vartheta(x)$ and $\psi(x)$ are of order x. We now prove 

THEOREM 414. The functions $\vartheta(x)$ and $\psi(x)$ are of order x: 

(22.2.1) $Ax < \vartheta(x) < Ax, \qquad Ax < \psi(x) < Ax \qquad (x \ge 2).$

It is enough, after Theorem 413, to prove that 

(22.2.2) $\vartheta(x) < Ax$

and 

(22.2.3) $\psi(x) > Ax \qquad (x \ge 2).$

In fact, we prove a result a little more precise than (22.2.2), viz. 

THEOREM 415: $\vartheta(n) < 2n \log 2$ for all $n \ge 1$. 

By Theorem 73, 

$M = \frac{(2m+1)!}{m!\,(m+1)!} = \frac{(2m+1)(2m)\cdots(m+2)}{m!}$

is an integer. It occurs twice in the binomial expansion of $(1+1)^{2m+1}$ 
and so $2M < 2^{2m+1}$ and $M < 2^{2m}$. 

If $m+1 < p \le 2m+1$, p divides the numerator but not the denominator 
of M. Hence 

$\Bigl(\prod_{m+1 < p \le 2m+1} p\Bigr) \Bigm| M$

and 

$\vartheta(2m+1) - \vartheta(m+1) = \sum_{m+1 < p \le 2m+1} \log p \le \log M < 2m \log 2.$

Theorem 415 is trivial for n = 1 and for n = 2. Let us suppose it 
true for all $n \le n_0 - 1$. If $n_0$ is even, we have 

$\vartheta(n_0) = \vartheta(n_0 - 1) < 2(n_0 - 1)\log 2 < 2n_0 \log 2.$

If $n_0$ is odd, say $n_0 = 2m+1$, we have 

$\vartheta(n_0) = \vartheta(2m+1) = \vartheta(2m+1) - \vartheta(m+1) + \vartheta(m+1)$
$< 2m \log 2 + 2(m+1)\log 2$
$= 2(2m+1)\log 2 = 2n_0 \log 2,$

since $m+1 < n_0$. Hence Theorem 415 is true for $n = n_0$ and so, by 
induction, for all n. The inequality (22.2.2) follows at once. 
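
The bound of Theorem 415 is easily tested numerically. The following is a minimal sketch, assuming a simple sieve; the names `theta_table` and `worst_ratio` are ours. It tabulates $\vartheta(n)$ for every n up to a limit and reports the largest value of $\vartheta(n)/(2n\log 2)$, which the theorem asserts is less than 1.

```python
import math

def theta_table(limit):
    """table[n] = theta(n) for 0 <= n <= limit (limit >= 1 is assumed)."""
    is_prime = [True] * (limit + 1)
    table = [0.0] * (limit + 1)
    for n in range(2, limit + 1):
        if is_prime[n]:
            # n is prime: strike out its multiples and add log n to theta.
            for k in range(n * n, limit + 1, n):
                is_prime[k] = False
            table[n] = table[n - 1] + math.log(n)
        else:
            table[n] = table[n - 1]
    return table

def worst_ratio(limit):
    """Largest value of theta(n) / (2 n log 2) for 1 <= n <= limit."""
    table = theta_table(limit)
    return max(table[n] / (2 * n * math.log(2)) for n in range(1, limit + 1))
```

A finite check of this kind proves nothing, of course; it only shows that the inequality has a comfortable margin in the range tested.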

We now prove (22.2.3). The numbers 1, 2, ..., n include just $[n/p]$ 
multiples of p, just $[n/p^2]$ multiples of $p^2$, and so on. Hence 

THEOREM 416: $n! = \prod_{p} p^{j(n,p)},$

where $j(n,p) = \sum_{m \ge 1} \left[\frac{n}{p^m}\right].$

We write $N = \frac{(2n)!}{(n!)^2} = \prod_{p \le 2n} p^{k_p},$

so that, by Theorem 416, 

(22.2.4) $k_p = \sum_{m=1}^{\infty} \left( \left[\frac{2n}{p^m}\right] - 2\left[\frac{n}{p^m}\right] \right).$

Each term in round brackets is 1 or 0, according as $[2n/p^m]$ is odd or 
even. In particular, the term is 0 if $p^m > 2n$. Hence 

(22.2.5) $k_p \le \left[\frac{\log 2n}{\log p}\right]$

and 

$\log N = \sum_{p \le 2n} k_p \log p \le \sum_{p \le 2n} \left[\frac{\log 2n}{\log p}\right] \log p = \psi(2n)$

by (22.1.4). But 

(22.2.6) $N = \frac{(2n)!}{(n!)^2} = \frac{n+1}{1} \cdot \frac{n+2}{2} \cdots \frac{2n}{n} \ge 2^n$

and so $\psi(2n) \ge n \log 2$. 

For $x \ge 2$, we put $n = [\tfrac12 x] \ge 1$ and have 

$\psi(x) \ge \psi(2n) \ge n \log 2 \ge \tfrac14 x \log 2,$

which is (22.2.3). 

22.3. Bertrand’s postulate and a ‘formula’ for primes. From Theorem 
414, we can deduce 

THEOREM 417. There is a number B such that, for every $x > 1$, there is a prime 
p satisfying 

$x < p < Bx.$

For, by Theorem 414, 

$C_1 x < \vartheta(x) < C_2 x \qquad (x \ge 2)$

for some fixed $C_1$, $C_2$. Hence 

$\vartheta(C_2 x / C_1) > C_1 (C_2 x / C_1) = C_2 x > \vartheta(x)$

and so there is a prime between x and $C_2 x / C_1$. If we put $B = \max(C_2/C_1, 2)$, 
Theorem 417 is immediate. 
We can, however, refine our argument a little to prove a more precise result. 

THEOREM 418 (Bertrand’s Postulate). If $n \ge 1$, there is at least one prime p 
such that 

(22.3.1) $n < p \le 2n;$

that is, if $p_r$ is the r-th prime, 

(22.3.2) $p_{r+1} < 2p_r$

for every r. 


The two parts of the theorem are clearly equivalent. Let us suppose that, 
for some $n > 2^9 = 512$, there is no prime satisfying (22.3.1). With the notation 
of § 22.2, let p be a prime factor of N, so that $k_p \ge 1$. By our hypothesis, $p \le n$. 
If $\tfrac23 n < p \le n$, we have 

$2p \le 2n < 3p, \qquad p^2 > \tfrac49 n^2 > 2n$

and (22.2.4) becomes 

$k_p = \left[\frac{2n}{p}\right] - 2\left[\frac{n}{p}\right] = 2 - 2 = 0.$

Hence $p \le \tfrac23 n$ for every prime factor p of N and so 

(22.3.3) $\sum_{p \mid N} \log p \le \sum_{p \le \frac23 n} \log p = \vartheta(\tfrac23 n) \le \tfrac43 n \log 2$

by Theorem 415. 

Next, if $k_p \ge 2$, we have by (22.2.5) 

$2\log p \le k_p \log p \le \log(2n), \qquad p \le \sqrt{2n}$

and so there are at most $\sqrt{2n}$ such values of p. Hence 

$\sum_{k_p \ge 2} k_p \log p \le \sqrt{2n}\,\log(2n),$

and so 

(22.3.4) $\log N \le \sum_{k_p = 1} \log p + \sum_{k_p \ge 2} k_p \log p \le \sum_{p \mid N} \log p + \sqrt{2n}\,\log(2n)$
$\le \tfrac43 n \log 2 + \sqrt{2n}\,\log(2n)$

by (22.3.3). 

On the other hand, N is the largest term in the expansion of $2^{2n} = (1+1)^{2n}$, 
so that 

$2^{2n} = 2 + \binom{2n}{1} + \binom{2n}{2} + \dots + \binom{2n}{2n-1} \le 2nN.$

Hence, by (22.3.4), 

$2n\log 2 \le \log(2n) + \log N \le \tfrac43 n \log 2 + \{1 + \sqrt{2n}\}\log(2n),$

which reduces to 

(22.3.5) $2n\log 2 \le 3\{1 + \sqrt{2n}\}\log(2n).$

We now write 

$\zeta = \frac{\log(n/512)}{10\log 2} > 0,$

so that $2n = 2^{10(1+\zeta)}$. Since $n > 512$, we have $\zeta > 0$. (22.3.5) becomes 

$2^{10(1+\zeta)} \le 30\,(2^{5+5\zeta} + 1)(1+\zeta),$

whence 

$2^{5\zeta} \le 30 \cdot 2^{-5}(1 + 2^{-5-5\zeta})(1+\zeta) < (1 - 2^{-5})(1 + 2^{-5})(1+\zeta) < 1 + \zeta.$

But $2^{5\zeta} = \exp(5\zeta\log 2) > 1 + 5\zeta\log 2 > 1 + \zeta,$

a contradiction. Hence, if $n > 512$, there must be a prime satisfying (22.3.1). 
Each of the primes 

2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631 

is less than twice its predecessor in the list. Hence one of them, at least, satisfies 
(22.3.1) for any $n \le 630$. This completes the proof of Theorem 418. 
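
Two steps of this proof lend themselves to a direct check. The sketch below (the names `is_prime`, `central_binomial_exponent`, `CHAIN` and `bertrand_prime_from_chain` are ours) computes the exponent $k_p$ of (22.2.4), so that one can see that no prime in the range $\tfrac23 n < p \le n$ divides N once $p^2 > 2n$, and it uses the list of eleven primes above to exhibit a prime between n and 2n for every $n \le 630$.

```python
def is_prime(n):
    """Primality by trial division (adequate for the small numbers used here)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def central_binomial_exponent(n, p):
    """k_p of (22.2.4): the exponent of the prime p in N = (2n)!/(n!)^2."""
    k, power = 0, p
    while power <= 2 * n:
        k += (2 * n) // power - 2 * (n // power)
        power *= p
    return k

# The primes listed in the text; each is less than twice its predecessor.
CHAIN = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631]

def bertrand_prime_from_chain(n):
    """Smallest p in CHAIN with n < p <= 2n, or None if the list has none."""
    for p in CHAIN:
        if n < p <= 2 * n:
            return p
    return None
```

For instance, `bertrand_prime_from_chain(100)` returns 163, and `central_binomial_exponent(n, p)` vanishes for every prime p with $\tfrac23 n < p \le n$ when $n \ge 5$.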
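We prove next 

THEOREM 419. If 

$\alpha = \sum_{m=1}^{\infty} p_m 10^{-2^m} = \cdot 02030005000000070\ldots,$

we have 

(22.3.6) $p_n = [10^{2^n}\alpha] - 10^{2^{n-1}}[10^{2^{n-1}}\alpha].$

By (2.2.2), 

$p_m < 2^{2^m} = 4^{2^{m-1}}$

and so the series for $\alpha$ is convergent. Again 

$0 < 10^{2^n} \sum_{m=n+1}^{\infty} p_m 10^{-2^m} < \sum_{m=n+1}^{\infty} 4^{2^{m-1}} 10^{-2^{m-1}} = \sum_{m=n+1}^{\infty} \left(\tfrac25\right)^{2^{m-1}} < \left(\tfrac25\right)^{2^n} \frac{1}{1 - \tfrac25} < 1.$

Hence 

$[10^{2^n}\alpha] = 10^{2^n} \sum_{m=1}^{n} p_m 10^{-2^m}$

and, similarly, 

$[10^{2^{n-1}}\alpha] = 10^{2^{n-1}} \sum_{m=1}^{n-1} p_m 10^{-2^m}.$

It follows that 

$[10^{2^n}\alpha] - 10^{2^{n-1}}[10^{2^{n-1}}\alpha] = 10^{2^n}\Bigl(\sum_{m=1}^{n} p_m 10^{-2^m} - \sum_{m=1}^{n-1} p_m 10^{-2^m}\Bigr) = p_n.$

Although (22.3.6) gives a ‘formula’ for the nth prime $p_n$, it is not a very useful 
one. To calculate $p_n$ from this formula, it is necessary to know the value of $\alpha$ 
correct to $2^n$ decimal places; and to do this, it is necessary to know the values 
of $p_1, p_2, \ldots, p_n$. 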
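
The circularity just described is plain in any attempt to use (22.3.6). In the sketch below (our own illustration; the names `first_primes`, `alpha_truncated` and `prime_from_alpha` are hypothetical) a partial sum of the series for $\alpha$ is built from the first few primes and kept exact as a rational number, and the primes are then read back from it. A partial sum with at least n terms suffices for $p_n$, since the neglected tail does not affect either integral part.

```python
import math
from fractions import Fraction

def first_primes(count):
    """The first `count` primes, by trial division against those already found."""
    primes = []
    n = 2
    while len(primes) < count:
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
        n += 1
    return primes

def alpha_truncated(terms):
    """Exact partial sum of alpha = sum_{m>=1} p_m 10^(-2^m), as a Fraction."""
    primes = first_primes(terms)
    return sum(Fraction(p, 10 ** (2 ** m)) for m, p in enumerate(primes, start=1))

def prime_from_alpha(n, alpha):
    """Right-hand side of (22.3.6), evaluated exactly."""
    big = 10 ** (2 ** n)
    small = 10 ** (2 ** (n - 1))
    return math.floor(big * alpha) - small * math.floor(small * alpha)
```

With `alpha = alpha_truncated(8)`, the values `prime_from_alpha(n, alpha)` for n = 1, ..., 8 are the first eight primes; but the eight primes were needed to form `alpha` in the first place.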




There are a number of similar formulae which suffer from the same defect. 
Thus, let us suppose that r is an integer greater than one. We have then 

$p_n \le r^n$

by (22.3.2). (Indeed, for $r \ge 4$, this follows from Theorem 20.) Hence we may 
write 

$\alpha_r = \sum_{m=1}^{\infty} p_m r^{-m^2}$

and we can deduce that 

$p_n = [r^{n^2}\alpha_r] - r^{2n-1}[r^{(n-1)^2}\alpha_r]$

by arguments similar to those used above. 

Any one of these formulae (or any similar one) would attain a different status 
if the exact value of the number $\alpha$ or $\alpha_r$ which occurs in it could be expressed 
independently of the primes. There seems no likelihood of this, but it cannot 
be ruled out as entirely impossible. 


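22.4. Proof of Theorems 7 and 9. It is easy to deduce Theorem 7 
from Theorem 414. In the first place 

$\vartheta(x) = \sum_{p \le x} \log p \le \log x \sum_{p \le x} 1 = \pi(x)\log x$

and so 

(22.4.1) $\pi(x) \ge \frac{\vartheta(x)}{\log x} > \frac{Ax}{\log x}.$

On the other hand, if $0 < \delta < 1$, 

$\vartheta(x) \ge \sum_{x^{1-\delta} < p \le x} \log p \ge (1-\delta)\log x \sum_{x^{1-\delta} < p \le x} 1$
$= (1-\delta)\log x\,\{\pi(x) - \pi(x^{1-\delta})\} \ge (1-\delta)\log x\,\{\pi(x) - x^{1-\delta}\}$

and so 

(22.4.2) $\pi(x) \le x^{1-\delta} + \frac{\vartheta(x)}{(1-\delta)\log x} < \frac{Ax}{\log x}.$

We can now prove 

THEOREM 420. $\pi(x) \sim \frac{\vartheta(x)}{\log x} \sim \frac{\psi(x)}{\log x}.$

After Theorems 413 and 414 we need only consider the first assertion. 
It follows from (22.4.1) and (22.4.2) that 

$1 \le \frac{\pi(x)\log x}{\vartheta(x)} \le \frac{x^{1-\delta}\log x}{\vartheta(x)} + \frac{1}{1-\delta}.$

For any $\epsilon > 0$, we can choose $\delta = \delta(\epsilon)$ so that 

$\frac{1}{1-\delta} < 1 + \tfrac12\epsilon$

and then choose $x_0 = x_0(\delta, \epsilon) = x_0(\epsilon)$ so that 

$\frac{x^{1-\delta}\log x}{\vartheta(x)} < \frac{A\log x}{x^{\delta}} < \tfrac12\epsilon$

for all $x > x_0$. Hence 

$1 \le \frac{\pi(x)\log x}{\vartheta(x)} < 1 + \epsilon$

for all $x > x_0$. Since $\epsilon$ is arbitrary, the first part of Theorem 420 follows 
at once. 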

Theorem 9 is (as stated in § 1.8) a corollary of Theorem 7. For, in 
the first place, 

$n = \pi(p_n) < \frac{Ap_n}{\log p_n}, \qquad p_n > An\log p_n > An\log n.$

Secondly 

$n = \pi(p_n) > \frac{Ap_n}{\log p_n},$

so that 

$\sqrt{p_n} < \frac{Ap_n}{\log p_n} < An, \qquad p_n < An^2,$

and 

$p_n < An\log p_n < An\log n.$
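
Theorem 9 says that $p_n/(n\log n)$ lies between two positive constants. A rough numerical impression of those constants can be had as follows (a sketch under our own naming; `primes_up_to` and `prime_ratio_range` do not occur in the text, and the range examined is of course finite).

```python
import math

def primes_up_to(limit):
    """All primes p <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            for k in range(i * i, limit + 1, i):
                flags[k] = False
    return [n for n, is_p in enumerate(flags) if is_p]

def prime_ratio_range(limit):
    """Smallest and largest value of p_n / (n log n) over primes p_n <= limit, n >= 2."""
    ratios = [p / (n * math.log(n))
              for n, p in enumerate(primes_up_to(limit), start=1) if n >= 2]
    return min(ratios), max(ratios)
```

The case n = 1 is omitted because log 1 = 0. Over the primes below 100000 the ratio stays between 1 and 2.2, its largest value occurring at n = 2.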


22.5. Two formal transformations. We introduce here two 
elementary formal transformations which will be useful throughout this 
chapter. 


THEOREM 421. Suppose that $c_1, c_2, \ldots$ is a sequence of numbers, that 

$C(t) = \sum_{n \le t} c_n,$

and that $f(t)$ is any function of t. Then 

(22.5.1) $\sum_{n \le x} c_n f(n) = \sum_{n \le x-1} C(n)\{f(n) - f(n+1)\} + C(x) f([x]).$

If, in addition, $c_j = 0$ for $j < n_1$† and $f(t)$ has a continuous derivative for 
$t \ge n_1$, then 

(22.5.2) $\sum_{n \le x} c_n f(n) = C(x) f(x) - \int_{n_1}^{x} C(t) f'(t)\,dt.$

† In our applications, $n_1 = 1$ or 2. If $n_1 = 1$, there is, of course, no restriction on the 
$c_n$. If $n_1 = 2$, we have $c_1 = 0$. 

If we write $N = [x]$, the sum on the left of (22.5.1) is 

$C(1)f(1) + \{C(2) - C(1)\}f(2) + \dots + \{C(N) - C(N-1)\}f(N)$
$= C(1)\{f(1) - f(2)\} + \dots + C(N-1)\{f(N-1) - f(N)\} + C(N)f(N).$

Since $C(N) = C(x)$, this proves (22.5.1). To deduce (22.5.2) we observe 
that $C(t) = C(n)$ when $n \le t < n+1$ and so 

$C(n)\{f(n) - f(n+1)\} = -\int_{n}^{n+1} C(t) f'(t)\,dt.$

Also $C(t) = 0$ when $t < n_1$. 
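
The identity (22.5.1) is purely formal, and it may be reassuring to see it hold for an arbitrary choice of $c_n$ and f. The following sketch evaluates both sides; the names `lhs` and `rhs` are ours, and $x \ge 1$ is assumed.

```python
import math

def lhs(c, f, x):
    """Left side of (22.5.1): the sum of c_n f(n) over 1 <= n <= x."""
    return sum(c(n) * f(n) for n in range(1, math.floor(x) + 1))

def rhs(c, f, x):
    """Right side of (22.5.1): sum_{n <= x-1} C(n){f(n) - f(n+1)} + C(x) f([x])."""
    big_n = math.floor(x)
    partial = 0.0            # running value of C(n) = c_1 + ... + c_n
    total = 0.0
    for n in range(1, big_n):
        partial += c(n)
        total += partial * (f(n) - f(n + 1))
    partial += c(big_n)      # now partial = C([x]) = C(x)
    return total + partial * f(big_n)
```

For example, with $c_n = (-1)^n n$ and $f(t) = t^{-1/2}$ the two functions agree at x = 37.5 to within rounding error.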
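If we put $c_n = 1$ and $f(t) = 1/t$, we have $C(x) = [x]$ and (22.5.2) 
becomes 

$\sum_{n \le x} \frac{1}{n} = \frac{[x]}{x} + \int_{1}^{x} \frac{[t]}{t^2}\,dt = \log x + \gamma + E,$

where 

$\gamma = 1 - \int_{1}^{\infty} \frac{t - [t]}{t^2}\,dt$

is independent of x and 

$E = \int_{x}^{\infty} \frac{t - [t]}{t^2}\,dt - \frac{x - [x]}{x} = \int_{x}^{\infty} \frac{O(1)}{t^2}\,dt + O\Bigl(\frac{1}{x}\Bigr) = O\Bigl(\frac{1}{x}\Bigr).$

Thus we have 

THEOREM 422: $\sum_{n \le x} \frac{1}{n} = \log x + \gamma + O\Bigl(\frac{1}{x}\Bigr),$

where $\gamma$ is a constant (known as Euler’s constant). 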
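
The error term in Theorem 422 can be watched numerically. In the sketch below (our own; the name `harmonic_error` and the decimal value supplied for Euler's constant are assumptions of the illustration, the latter being the familiar value 0.5772156649...), the quantity E of the proof is computed for a given x; the proof shows that $|E| < 1/x$.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant, to double precision

def harmonic_error(x):
    """E = sum_{n <= x} 1/n - log x - gamma, which is O(1/x) by Theorem 422."""
    partial = math.fsum(1.0 / n for n in range(1, math.floor(x) + 1))
    return partial - math.log(x) - EULER_GAMMA
```

For integral x the error is close to 1/(2x); for non-integral x it may be of either sign.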
22.6. An important sum. We prove first the lemma 

THEOREM 423: $\sum_{n \le x} \log^h\Bigl(\frac{x}{n}\Bigr) = O(x) \qquad (h > 0).$

Since $\log t$ increases with t, we have, for $n \ge 2$, 

$\log^h\Bigl(\frac{x}{n}\Bigr) \le \int_{n-1}^{n} \log^h\Bigl(\frac{x}{t}\Bigr)\,dt.$

Hence 

$\sum_{n=2}^{[x]} \log^h\Bigl(\frac{x}{n}\Bigr) \le \int_{1}^{x} \log^h\Bigl(\frac{x}{t}\Bigr)\,dt = x\int_{1}^{x} \frac{\log^h u}{u^2}\,du < x\int_{1}^{\infty} \frac{\log^h u}{u^2}\,du = Ax,$

since the infinite integral is convergent. Theorem 423 follows at once. 

If we put h = 1, we have 

$\sum_{n \le x} \log n = [x]\log x + O(x) = x\log x + O(x).$

But, by Theorem 416, 

$\sum_{n \le x} \log n = \sum_{p \le x} j([x], p)\log p = \sum_{p^m \le x} \left[\frac{x}{p^m}\right]\log p = \sum_{n \le x} \left[\frac{x}{n}\right]\Lambda(n)$

in the notation of § 17.7. If we remove the square brackets in the last 
sum, we introduce an error less than 

$\sum_{n \le x} \Lambda(n) = \psi(x) = O(x)$

and so 

$\sum_{n \le x} \frac{x}{n}\Lambda(n) = \sum_{n \le x} \log n + O(x) = x\log x + O(x).$

If we remove a factor x, we have 

THEOREM 424: $\sum_{n \le x} \frac{\Lambda(n)}{n} = \log x + O(1).$
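
Theorem 424 asserts only that the difference between $\sum_{n \le x} \Lambda(n)/n$ and $\log x$ is bounded. The sketch below (our own illustration; `mangoldt_table` and `theorem_424_gap` are hypothetical names) tabulates $\Lambda(n)$ by a sieve and computes that difference for a given integral x.

```python
import math

def mangoldt_table(limit):
    """lam[n] = Lambda(n): log p if n is a power p^m (m >= 1) of a prime p, else 0."""
    lam = [0.0] * (limit + 1)
    composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not composite[p]:
            for k in range(2 * p, limit + 1, p):
                composite[k] = 1
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    return lam

def theorem_424_gap(x):
    """sum_{n <= x} Lambda(n)/n - log x, for an integer x >= 1."""
    lam = mangoldt_table(x)
    return math.fsum(lam[n] / n for n in range(1, x + 1)) - math.log(x)
```

In the range we have tried the difference is negative and remains well inside the interval (-2, 2).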
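From this we can deduce 

THEOREM 425: $\sum_{p \le x} \frac{\log p}{p} = \log x + O(1).$

For 

$\sum_{n \le x} \frac{\Lambda(n)}{n} - \sum_{p \le x} \frac{\log p}{p} = \sum_{m \ge 2} \sum_{p^m \le x} \frac{\log p}{p^m} < \sum_{p} \Bigl(\frac{1}{p^2} + \frac{1}{p^3} + \dots\Bigr)\log p = \sum_{p} \frac{\log p}{p(p-1)} < \sum_{n=2}^{\infty} \frac{\log n}{n(n-1)} = A.$

If, in (22.5.2), we put $f(t) = 1/t$ and $c_n = \Lambda(n)$, so that $C(x) = \psi(x)$, 
we have 

$\sum_{n \le x} \frac{\Lambda(n)}{n} = \frac{\psi(x)}{x} + \int_{2}^{x} \frac{\psi(t)}{t^2}\,dt$

and so, by Theorems 414 and 424, we have 

(22.6.1) $\int_{2}^{x} \frac{\psi(t)}{t^2}\,dt = \log x + O(1).$

From (22.6.1) we can deduce 

(22.6.2) $\liminf \{\psi(x)/x\} \le 1, \qquad \limsup \{\psi(x)/x\} \ge 1.$

For, if $\liminf \{\psi(x)/x\} = 1 + \delta$, where $\delta > 0$, we have $\psi(x) > (1 + \tfrac12\delta)x$ for 
all x greater than some $x_0$. Hence 

$\int_{2}^{x} \frac{\psi(t)}{t^2}\,dt > \int_{2}^{x_0} \frac{\psi(t)}{t^2}\,dt + \int_{x_0}^{x} \frac{1 + \tfrac12\delta}{t}\,dt > (1 + \tfrac12\delta)\log x - A,$

in contradiction to (22.6.1). If we suppose that $\limsup \{\psi(x)/x\} = 1 - \delta$, 
we get a similar contradiction. 

By Theorem 420, we can deduce from (22.6.2) 

THEOREM 426: $\liminf \Bigl\{\pi(x) \Big/ \frac{x}{\log x}\Bigr\} \le 1, \qquad \limsup \Bigl\{\pi(x) \Big/ \frac{x}{\log x}\Bigr\} \ge 1.$

If $\pi(x) \big/ \frac{x}{\log x}$ tends to a limit as $x \to \infty$, the limit is 1. 

Theorem 6 would follow at once if we could prove that $\pi(x) \big/ \frac{x}{\log x}$ 
tends to a limit. Unfortunately this is the real difficulty in the proof 
of Theorem 6. 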

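22.7. The sum $\sum p^{-1}$ and the product $\prod (1 - p^{-1})$. Since 

(22.7.1) $0 < \log\Bigl(\frac{1}{1 - p^{-1}}\Bigr) - \frac{1}{p} = \frac{1}{2p^2} + \frac{1}{3p^3} + \dots < \frac{1}{2p^2} + \frac{1}{2p^3} + \dots = \frac{1}{2p(p-1)}$

and 

$\sum \frac{1}{p(p-1)}$

is convergent, the series 

$\sum \Bigl\{\log\Bigl(\frac{1}{1 - p^{-1}}\Bigr) - \frac{1}{p}\Bigr\}$

must be convergent. By Theorem 19, $\sum p^{-1}$ is divergent and so the 
product 

(22.7.2) $\prod (1 - p^{-1})$

must diverge also (to zero). 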


From the divergence of the product (22.7.2) we can deduce that 

$\pi(x) = o(x),$

i.e. almost all numbers are composite, without using any of the results of §§ 22.1-6. 
Of course, this result is weaker than Theorem 7, but the very simple proof is of 
some interest. 

If $\varpi(x, r)$ is the number of numbers which (i) do not exceed x and (ii) are not 
divisible by any of the first r primes $p_1, p_2, \ldots, p_r$, then 

$\pi(x) \le \varpi(x, r) + r$

and, by Theorem 261, 

$\varpi(x, r) = [x] - \sum_{i} \left[\frac{x}{p_i}\right] + \sum_{i,j} \left[\frac{x}{p_i p_j}\right] - \dots,$

where i, j, ... are unequal and run from 1 to r. The number of square brackets is 

$1 + \binom{r}{1} + \binom{r}{2} + \dots = 2^r$

and the error introduced by the removal of a square bracket is less than 1. 
Hence 

$\varpi(x, r) \le x - \sum \frac{x}{p_i} + \sum \frac{x}{p_i p_j} - \dots + 2^r = x\prod_{p \le p_r} \Bigl(1 - \frac{1}{p}\Bigr) + 2^r$

and 

$\pi(x) \le x\prod_{p \le p_r} \Bigl(1 - \frac{1}{p}\Bigr) + 2^r + r.$

Since $\prod (1 - p^{-1})$ diverges to zero, we can, for any $\epsilon > 0$, choose $r = r(\epsilon)$ so that 

$\prod_{p \le p_r} \Bigl(1 - \frac{1}{p}\Bigr) < \tfrac12\epsilon$

and 

$\pi(x) \le \tfrac12\epsilon x + 2^r + r < \epsilon x$

for $x \ge x_0(\epsilon, r) = x_0(\epsilon)$. Thus $\pi(x) = o(x)$. 
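
The counting function of this argument can be computed exactly from the inclusion-exclusion formula, at the cost of $2^r$ terms. The following is a sketch only (the names `first_primes` and `sieve_count` are ours, and x is taken to be a positive integer).

```python
from itertools import combinations
from math import prod

def first_primes(count):
    """The first `count` primes, by trial division against those already found."""
    primes = []
    n = 2
    while len(primes) < count:
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
        n += 1
    return primes

def sieve_count(x, r):
    """Number of n with 1 <= n <= x divisible by none of the first r primes,
    from the inclusion-exclusion formula with its 2^r square brackets."""
    primes = first_primes(r)
    total = 0
    for size in range(r + 1):
        for subset in combinations(primes, size):
            # prod(()) == 1, so the empty subset contributes the term [x].
            total += (-1) ** size * (x // prod(subset))
    return total
```

Thus `sieve_count(100, 4)` is 22 (the number 1 and the twenty-one primes between 11 and 97), and the inequality $\pi(x) \le \varpi(x, r) + r$ reads 25 ≤ 26.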
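We can prove the divergence of $\prod (1 - p^{-1})$ independently of that of 
$\sum p^{-1}$ as follows. It is plain that 

$\prod_{p \le N} \Bigl(\frac{1}{1 - p^{-1}}\Bigr) = \prod_{p \le N} \Bigl(1 + \frac{1}{p} + \frac{1}{p^2} + \dots\Bigr) = \sum_{(N)} \frac{1}{n},$

the last sum being extended over all n composed of prime factors $p \le N$. 
Since all $n \le N$ satisfy this condition, 

$\prod_{p \le N} \Bigl(\frac{1}{1 - p^{-1}}\Bigr) \ge \sum_{n=1}^{N} \frac{1}{n} > \log N - A$

by Theorem 422. Hence the product (22.7.2) is divergent. 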
If we use the results of the last two sections, we can obtain much 
more exact information about $\sum p^{-1}$. In Theorem 421, let us put 
$c_p = \log p / p$, and $c_n = 0$ if n is not a prime, so that 

$C(x) = \sum_{p \le x} \frac{\log p}{p} = \log x + \tau(x),$

where $\tau(x) = O(1)$ by Theorem 425. With $f(t) = 1/\log t$, (22.5.2) becomes 

(22.7.3) $\sum_{p \le x} \frac{1}{p} = \frac{C(x)}{\log x} + \int_{2}^{x} \frac{C(t)}{t\log^2 t}\,dt$
$= 1 + \frac{\tau(x)}{\log x} + \int_{2}^{x} \frac{dt}{t\log t} + \int_{2}^{x} \frac{\tau(t)\,dt}{t\log^2 t}$
$= \log\log x + B_1 + E(x),$

where 

$B_1 = 1 - \log\log 2 + \int_{2}^{\infty} \frac{\tau(t)\,dt}{t\log^2 t}$

and 

(22.7.4) $E(x) = \frac{\tau(x)}{\log x} - \int_{x}^{\infty} \frac{\tau(t)\,dt}{t\log^2 t} = O\Bigl(\frac{1}{\log x}\Bigr) + O\Bigl(\int_{x}^{\infty} \frac{dt}{t\log^2 t}\Bigr) = O\Bigl(\frac{1}{\log x}\Bigr).$

Hence we have 

THEOREM 427: $\sum_{p \le x} \frac{1}{p} = \log\log x + B_1 + o(1),$

where $B_1$ is a constant. 


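22.8. Mertens’s theorem. It is interesting to push our study of 
the series and product of the last section a little further. 

THEOREM 428. In Theorem 427, 

(22.8.1) $B_1 = \gamma + \sum_{p} \Bigl\{\log\Bigl(1 - \frac{1}{p}\Bigr) + \frac{1}{p}\Bigr\},$

where $\gamma$ is Euler’s constant. 

THEOREM 429 (Mertens’s theorem): 

$\prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr) \sim \frac{e^{-\gamma}}{\log x}.$

As we saw in § 22.7, the series in (22.8.1) converges. Since 

$\sum_{p \le x} \frac{1}{p} + \sum_{p \le x} \log\Bigl(1 - \frac{1}{p}\Bigr) = \sum_{p \le x} \Bigl\{\log\Bigl(1 - \frac{1}{p}\Bigr) + \frac{1}{p}\Bigr\},$

Theorem 429 follows from Theorems 427 and 428. Hence it is enough 
to prove Theorem 428. We shall assume that† 

(22.8.2) $\gamma = -\Gamma'(1) = -\int_{0}^{\infty} e^{-x}\log x\,dx.$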
† See, for example, Whittaker and Watson, Modern analysis, ch. xii. 

If $\delta \ge 0$, we have 

$0 < -\log\Bigl(1 - \frac{1}{p^{1+\delta}}\Bigr) - \frac{1}{p^{1+\delta}} < \frac{1}{2p^{1+\delta}(p^{1+\delta} - 1)} \le \frac{1}{2p(p-1)}$

by calculations similar to those of (22.7.1). Hence the series 

$F(\delta) = \sum_{p} \Bigl\{\log\Bigl(1 - \frac{1}{p^{1+\delta}}\Bigr) + \frac{1}{p^{1+\delta}}\Bigr\}$

is uniformly convergent for all $\delta \ge 0$ and so 

$F(\delta) \to F(0)$

as $\delta \to 0$ through positive values. 

We now suppose $\delta > 0$. By Theorem 280, 

$F(\delta) = g(\delta) - \log\zeta(1+\delta),$

where $g(\delta) = \sum_{p} p^{-1-\delta}.$
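If, in Theorem 421, we put $c_p = 1/p$ and $c_n = 0$ when n is not prime, 
we have 

$C(x) = \sum_{p \le x} \frac{1}{p} = \log\log x + B_1 + E(x)$

by (22.7.3). Hence, if $f(t) = t^{-\delta}$, (22.5.2) becomes 

$\sum_{p \le x} p^{-1-\delta} = x^{-\delta}C(x) + \delta\int_{2}^{x} t^{-1-\delta}C(t)\,dt.$

Letting $x \to \infty$, we have 

$g(\delta) = \delta\int_{2}^{\infty} t^{-1-\delta}C(t)\,dt$
$= \delta\int_{2}^{\infty} t^{-1-\delta}(\log\log t + B_1)\,dt + \delta\int_{2}^{\infty} t^{-1-\delta}E(t)\,dt.$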


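Now, if we put $t = e^{u/\delta}$, 

$\delta\int_{1}^{\infty} t^{-1-\delta}\log\log t\,dt = \int_{0}^{\infty} e^{-u}\log\Bigl(\frac{u}{\delta}\Bigr)\,du = -\gamma - \log\delta$

by (22.8.2), and 

$\delta\int_{1}^{\infty} t^{-1-\delta}\,dt = 1.$

Hence 

$g(\delta) + \log\delta - B_1 + \gamma = \delta\int_{2}^{\infty} t^{-1-\delta}E(t)\,dt - \delta\int_{1}^{2} t^{-1-\delta}(\log\log t + B_1)\,dt.$

Now, by (22.7.4), if $T = \exp(1/\sqrt{\delta})$, 

$\Bigl|\delta\int_{2}^{\infty} t^{-1-\delta}E(t)\,dt\Bigr| < A\delta\int_{2}^{T} \frac{dt}{t} + \frac{A\delta}{\log T}\int_{T}^{\infty} \frac{dt}{t^{1+\delta}}$
$< A\delta\log T + \frac{A}{\log T}$
$< A\sqrt{\delta} \to 0$

as $\delta \to 0$. Also 

$\Bigl|\int_{1}^{2} t^{-1-\delta}(\log\log t + B_1)\,dt\Bigr| < \int_{1}^{2} t^{-1}(|\log\log t| + |B_1|)\,dt = A,$

since the integral converges at t = 1. Hence 

$g(\delta) + \log\delta \to B_1 - \gamma$

as $\delta \to 0$. 

But, by Theorem 282, 

$\log\zeta(1+\delta) + \log\delta \to 0$

as $\delta \to 0$ and so $F(\delta) \to B_1 - \gamma$. 
Hence 

$B_1 = \gamma + F(0),$

which is (22.8.1). 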
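
Theorems 427 and 429 are readily illustrated by computation. In the sketch below (ours, not the authors'; the function names and the decimal value of Euler's constant are assumptions of the illustration), `mertens_ratio(x)` is the product $\prod_{p \le x}(1 - 1/p)$ multiplied by $e^{\gamma}\log x$, which should tend to 1, and `b1_estimate(x)` is $\sum_{p \le x} 1/p - \log\log x$, which should tend to $B_1$.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant, to double precision

def primes_up_to(limit):
    """All primes p <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            for k in range(i * i, limit + 1, i):
                flags[k] = False
    return [n for n, is_p in enumerate(flags) if is_p]

def mertens_ratio(x):
    """e^gamma * log x * prod_{p <= x} (1 - 1/p), for an integer x >= 2."""
    log_product = math.fsum(math.log1p(-1.0 / p) for p in primes_up_to(x))
    return math.log(x) * math.exp(EULER_GAMMA + log_product)

def b1_estimate(x):
    """sum_{p <= x} 1/p - log log x, for an integer x >= 2."""
    return math.fsum(1.0 / p for p in primes_up_to(x)) - math.log(math.log(x))
```

At x = 100000 the first quantity differs from 1, and the second from 0.2615, by less than 0.01; the value 0.2615 is the usual decimal approximation to $B_1$ and is quoted here only for comparison.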
22.9. Proof of Theorems 323 and 328. We are now able to prove 
Theorems 323 and 328. If we write 

$f_1(n) = \frac{\phi(n)\,e^{\gamma}\log\log n}{n}, \qquad f_2(n) = \frac{\sigma(n)}{n\,e^{\gamma}\log\log n},$

we have to show that 

$\liminf f_1(n) = 1, \qquad \limsup f_2(n) = 1.$

It will be enough to find two functions $F_1(t)$, $F_2(t)$, each tending to 1 as 
$t \to \infty$ and such that 

(22.9.1) $f_1(n) \ge F_1(\log n), \qquad f_2(n) \le \frac{1}{F_1(\log n)}$

for all $n \ge 3$ and 

(22.9.2) $f_2(n_j) \ge F_2(j), \qquad f_1(n_j) \le \frac{1}{F_2(j)}$

for an infinite increasing sequence $n_2, n_3, n_4, \ldots$. 

By Theorem 329, $f_1(n) f_2(n) < 1$ and so the second inequality in 
(22.9.1) follows from the first; similarly for (22.9.2). 

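Let $p_1, p_2, \ldots, p_{r-\rho}$ be the primes which divide n and which do not 
exceed $\log n$ and let $p_{r-\rho+1}, \ldots, p_r$ be those which divide n and are 
greater than $\log n$. We have 

$(\log n)^{\rho} < p_{r-\rho+1} \cdots p_r \le n, \qquad \rho < \frac{\log n}{\log\log n}$

and so 

$\frac{\phi(n)}{n} = \prod_{i=1}^{r} \Bigl(1 - \frac{1}{p_i}\Bigr) \ge \Bigl(1 - \frac{1}{\log n}\Bigr)^{\rho} \prod_{i=1}^{r-\rho} \Bigl(1 - \frac{1}{p_i}\Bigr)$
$> \Bigl(1 - \frac{1}{\log n}\Bigr)^{\log n/\log\log n} \prod_{p \le \log n} \Bigl(1 - \frac{1}{p}\Bigr).$

Hence the first part of (22.9.1) is true with 

$F_1(t) = e^{\gamma}\log t\,\Bigl(1 - \frac{1}{t}\Bigr)^{t/\log t} \prod_{p \le t} \Bigl(1 - \frac{1}{p}\Bigr).$

But, by Theorem 429, as $t \to \infty$, 

$F_1(t) \sim \Bigl(1 - \frac{1}{t}\Bigr)^{t/\log t} = 1 + O\Bigl(\frac{1}{\log t}\Bigr) \to 1.$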


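To prove the first part of (22.9.2), we write 

$n_j = \prod_{p \le e^j} p^j \qquad (j \ge 2),$

so that 

$\log n_j = j\vartheta(e^j) \le Aje^j$

by Theorem 414. Hence 

$\log\log n_j \le A_0 + j + \log j.$

Again 

$\prod_{p \le e^j} (1 - p^{-j-1}) > \prod_{p} (1 - p^{-j-1}) = \frac{1}{\zeta(j+1)}$

by Theorem 280. Hence 

$f_2(n_j) = \frac{\sigma(n_j)}{n_j\,e^{\gamma}\log\log n_j} = \frac{e^{-\gamma}}{\log\log n_j} \prod_{p \le e^j} \Bigl(\frac{1 - p^{-j-1}}{1 - p^{-1}}\Bigr)$
$\ge \frac{e^{-\gamma}}{\zeta(j+1)(A_0 + j + \log j)} \prod_{p \le e^j} \Bigl(\frac{1}{1 - p^{-1}}\Bigr) = F_2(j)$

(say). This is the first part of (22.9.2). Again, as $j \to \infty$, $\zeta(j+1) \to 1$ 
and, by Theorem 429, 

$F_2(j) \sim \frac{j}{\zeta(j+1)(A_0 + j + \log j)} \to 1.$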
22.10. The number of prime factors of n. We define $\omega(n)$ as the 
number of different prime factors of n, and $\Omega(n)$ as its total number of 
prime factors; thus 

$\omega(n) = r, \qquad \Omega(n) = a_1 + a_2 + \dots + a_r,$

when $n = p_1^{a_1} \cdots p_r^{a_r}$. 

Both $\omega(n)$ and $\Omega(n)$ behave irregularly for large n. Thus both functions 
are 1 when n is prime, while 

$\Omega(n) = \frac{\log n}{\log 2}$

when n is a power of 2. If 

$n = p_1 p_2 \cdots p_r$

is the product of the first r primes, then 

$\omega(n) = r = \pi(p_r), \qquad \log n = \vartheta(p_r)$

and so, by Theorems 420 and 414, 

$\omega(n) \sim \frac{\vartheta(p_r)}{\log p_r} \sim \frac{\log n}{\log\log n}$

(when $n \to \infty$ through this particular sequence of values). 
THEOREM 430. The average order of both $\omega(n)$ and $\Omega(n)$ is $\log\log n$. 
More precisely 

(22.10.1) $\sum_{n \le x} \omega(n) = x\log\log x + B_1 x + o(x),$

(22.10.2) $\sum_{n \le x} \Omega(n) = x\log\log x + B_2 x + o(x),$

where $B_1$ is the number in Theorems 427 and 428 and 

$B_2 = B_1 + \sum_{p} \frac{1}{p(p-1)}.$


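We write 

$S_1 = \sum_{n \le x} \omega(n) = \sum_{n \le x} \sum_{p \mid n} 1 = \sum_{p \le x} \left[\frac{x}{p}\right],$

since there are just $[x/p]$ values of $n \le x$ which are multiples of p. 
Removing the square brackets, we have 

(22.10.3) $S_1 = \sum_{p \le x} \frac{x}{p} + O\{\pi(x)\} = x\log\log x + B_1 x + o(x)$

by Theorems 7 and 427. 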


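Similarly 

(22.10.4) $S_2 = \sum_{n \le x} \Omega(n) = \sum_{n \le x} \sum_{p^m \mid n} 1 = \sum_{p^m \le x} \left[\frac{x}{p^m}\right],$

so that 

$S_2 - S_1 = {\sum}' \left[\frac{x}{p^m}\right],$

where $\sum'$ denotes summation over all $p^m \le x$ for which $m \ge 2$. If we 
remove the square brackets in the last sum the error introduced is less 
than 

${\sum}' 1 \le \frac{1}{\log 2} {\sum}' \log p = \frac{\psi(x) - \vartheta(x)}{\log 2} = o(x)$

by Theorem 413. Hence 

$S_2 - S_1 = x {\sum}' p^{-m} + o(x).$

The series 

$\sum_{m \ge 2} \sum_{p} p^{-m} = \sum_{p} \Bigl(\frac{1}{p^2} + \frac{1}{p^3} + \dots\Bigr) = \sum_{p} \frac{1}{p(p-1)} = B_2 - B_1$

is convergent and so 

${\sum}' p^{-m} = B_2 - B_1 + o(1)$

as $x \to \infty$. Hence 

$S_2 - S_1 = (B_2 - B_1)x + o(x)$

and (22.10.2) follows from (22.10.3). 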
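
The two sums $S_1$ and $S_2$ of this proof are easily tabulated. The sketch below is our own (the name `omega_tables` is hypothetical); it computes $\omega(n)$ and $\Omega(n)$ for all n up to a limit by a sieve, after which $S_1/x$ and $(S_2 - S_1)/x$ may be compared with $\sum_{p \le x} 1/p$ and $\sum_{p \le x} 1/\{p(p-1)\}$ respectively.

```python
def omega_tables(limit):
    """Return (small, big) with small[n] = omega(n) and big[n] = Omega(n)."""
    small = [0] * (limit + 1)
    big = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if small[p] == 0:                  # no smaller prime divides p: p is prime
            for k in range(p, limit + 1, p):
                small[k] += 1              # p is a distinct prime factor of k
            power = p
            while power <= limit:
                for k in range(power, limit + 1, power):
                    big[k] += 1            # one more factor p for each p^m dividing k
                power *= p
    return small, big
```

The difference between $\sum_{p \le x} 1/p$ and $S_1/x$ is exactly the effect of removing the square brackets in (22.10.3), and so lies between 0 and $\pi(x)/x$.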


22.11. The normal order of $\omega(n)$ and $\Omega(n)$. The functions $\omega(n)$ 
and $\Omega(n)$ are irregular, but have a definite ‘average order’ $\log\log n$. 
There is another interesting sense in which they may be said to have 
‘on the whole’ a definite order. We shall say, roughly, that $f(n)$ has 
the normal order $F(n)$ if $f(n)$ is approximately $F(n)$ for almost all values 
of n. More precisely, suppose that 

(22.11.1) $(1 - \epsilon)F(n) < f(n) < (1 + \epsilon)F(n)$

for every positive $\epsilon$ and almost all values of n. Then we say that the 
normal order of $f(n)$ is $F(n)$. Here ‘almost all’ is used in the sense of 
§§ 1.6 and 9.9. There may be an exceptional ‘infinitesimal’ set of n for 
which (22.11.1) is false, and this exceptional set will naturally depend 
upon $\epsilon$. 

A function may possess an average order, but no normal order, or 
conversely. Thus the function 

$f(n) = 0$ (n even), $\qquad f(n) = 2$ (n odd) 

has the average order 1, but no normal order. The function 

$f(n) = 2^m$ ($n = 2^m$), $\qquad f(n) = 1$ ($n \ne 2^m$) 

has the normal order 1, but no average order. 


THEOREM 431. The normal order of $\omega(n)$ and $\Omega(n)$ is $\log\log n$. More 
precisely, the number of n, not exceeding x, for which 

(22.11.2) $|f(n) - \log\log n| > (\log\log n)^{\frac12 + \delta},$

where $f(n)$ is $\omega(n)$ or $\Omega(n)$, is $o(x)$ for every positive $\delta$. 

It is sufficient to prove that the number of n for which 

(22.11.3) $|f(n) - \log\log x| > (\log\log x)^{\frac12 + \delta}$

is $o(x)$; the distinction between $\log\log n$ and $\log\log x$ has no importance. 
For 

$\log\log x - 1 \le \log\log n \le \log\log x$

when $x^{1/e} \le n \le x$, so that $\log\log n$ is practically $\log\log x$ for all such 
values of n; and the number of other values of n in question is 

$O(x^{1/e}) = o(x).$




Next, we need only consider the case $f(n) = \omega(n)$. For $\Omega(n) \ge \omega(n)$ 
and, by (22.10.1) and (22.10.2), 

$\sum_{n \le x} \{\Omega(n) - \omega(n)\} = O(x).$

Hence the number of $n \le x$ for which 

$\Omega(n) - \omega(n) > (\log\log x)^{\frac12}$

is 

$O\Bigl(\frac{x}{(\log\log x)^{\frac12}}\Bigr) = o(x);$

so that one case of Theorem 431 follows from the other. 

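Let us consider the number of pairs of different prime factors p, q of 
n (i.e. $p \ne q$), counting the pair q, p distinct from p, q. There are 
$\omega(n)$ possible values of p and, with each of these, just $\omega(n) - 1$ possible 
values of q. Hence 

$\omega(n)\{\omega(n) - 1\} = \sum_{pq \mid n,\; p \ne q} 1 = \sum_{pq \mid n} 1 - \sum_{p^2 \mid n} 1.$

Summing over all $n \le x$, we have 

$\sum_{n \le x} \{\omega(n)\}^2 - \sum_{n \le x} \omega(n) = \sum_{n \le x} \Bigl(\sum_{pq \mid n} 1 - \sum_{p^2 \mid n} 1\Bigr) = \sum_{pq \le x} \left[\frac{x}{pq}\right] - \sum_{p^2 \le x} \left[\frac{x}{p^2}\right].$

First 

$\sum_{p^2 \le x} \left[\frac{x}{p^2}\right] \le \sum_{p^2 \le x} \frac{x}{p^2} = O(x),$

since the series $\sum p^{-2}$ is convergent. Next 

$\sum_{pq \le x} \left[\frac{x}{pq}\right] = x \sum_{pq \le x} \frac{1}{pq} + O(x).$

Hence, using (22.10.1), we have 

(22.11.4) $\sum_{n \le x} \{\omega(n)\}^2 = x \sum_{pq \le x} \frac{1}{pq} + O(x\log\log x).$

Now 

(22.11.5) $\Bigl(\sum_{p \le \sqrt{x}} \frac{1}{p}\Bigr)^2 \le \sum_{pq \le x} \frac{1}{pq} \le \Bigl(\sum_{p \le x} \frac{1}{p}\Bigr)^2,$

since, if $pq \le x$, then $p < x$ and $q < x$, while, if $p \le \sqrt{x}$ and $q \le \sqrt{x}$, 
then $pq \le x$. The outside terms in (22.11.5) are each 

$\{\log\log x + O(1)\}^2 = (\log\log x)^2 + O(\log\log x)$

and therefore 

(22.11.6) $\sum_{n \le x} \{\omega(n)\}^2 = x(\log\log x)^2 + O(x\log\log x).$

It follows that 

(22.11.7) $\sum_{n \le x} \{\omega(n) - \log\log x\}^2$
$= \sum_{n \le x} \{\omega(n)\}^2 - 2\log\log x \sum_{n \le x} \omega(n) + [x](\log\log x)^2$
$= x(\log\log x)^2 + O(x\log\log x) - 2\log\log x\,\{x\log\log x + O(x)\} + \{x + O(1)\}(\log\log x)^2$
$= x(\log\log x)^2 - 2x(\log\log x)^2 + x(\log\log x)^2 + O(x\log\log x)$
$= O(x\log\log x)$

by (22.10.1) and (22.11.6). 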
If there are more than $\eta x$ numbers, not exceeding x, which satisfy 
(22.11.3) with $f(n) = \omega(n)$, then 

$\sum_{n \le x} \{\omega(n) - \log\log x\}^2 \ge \eta x(\log\log x)^{1 + 2\delta},$

which contradicts (22.11.7) for sufficiently large x; and this is true for 
every positive $\eta$. Hence the number of n which satisfy (22.11.3) is 
$o(x)$; and this proves the theorem. 


22.12. A note on round numbers. A number is usually called 
‘round’ if it is the product of a considerable number of comparatively 
small factors. Thus $1200 = 2^4 \cdot 3 \cdot 5^2$ would certainly be called round. 
The roundness of a number like $2187 = 3^7$ is obscured by the decimal 
notation. 

It is a matter of common observation that round numbers are very 
rare; the fact may be verified by any one who will make a habit of 
factorizing numbers which, like numbers of taxi-cabs or railway 
carriages, are presented to his attention in a random manner. Theorem 
431 contains the mathematical explanation of this phenomenon. 

Either of the functions $\omega(n)$ or $\Omega(n)$ gives a natural measure of the 
‘roundness’ of n, and each of them is usually about $\log\log n$, a function 
of n which increases very slowly. Thus $\log\log 10^7$ is a little less than 3, 
and $\log\log 10^{80}$ is a little larger than 5. A number near $10^7$ (the limit 
of the factor tables) will usually have about 3 prime factors; and a 
number near $10^{80}$ (the number, approximately, of protons in the 
universe) about 5 or 6. A number like 

$6092087 = 37 \cdot 229 \cdot 719$

is in a sense a ‘typical’ number. 
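
The numerical statements of this section are quickly confirmed. The following is a small sketch under our own naming (`factorize`, `omega` and `big_omega` are not notation from the text) which factorizes a number by trial division and counts its prime factors in the two senses of § 22.10.

```python
import math

def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent} (trial division)."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:                      # what remains is a single prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors

def omega(n):
    """Number of different prime factors of n."""
    return len(factorize(n))

def big_omega(n):
    """Total number of prime factors of n, counted with multiplicity."""
    return sum(factorize(n).values())
```

Thus `factorize(6092087)` gives the three primes 37, 229, 719, while `omega(1200)` is 3 and `big_omega(1200)` is 7; and `math.log(math.log(10 ** 7))` is about 2.78.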


These facts seem at first very surprising, but the real paradox lies a 
little deeper. What is really surprising is that most numbers should 
have so many factors and not that they should have so few. Theorem 
431 contains two assertions, that $\omega(n)$ is usually not much larger than 
$\log\log n$ and that it is usually not much smaller; and it is the second 
assertion which lies deeper and is more difficult to prove. That $\omega(n)$ 
is usually not much larger than $\log\log n$ can be deduced from Theorem 
430 without the aid of (22.11.6).† 


22.13. The normal order of d(n). If $n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}$, then 

$\omega(n) = r, \qquad \Omega(n) = a_1 + a_2 + \dots + a_r, \qquad d(n) = (1 + a_1)(1 + a_2) \cdots (1 + a_r).$

Also 

$2 \le 1 + a \le 2^a$

and 

$2^{\omega(n)} \le d(n) \le 2^{\Omega(n)}.$

Hence, after Theorem 431, the normal order of $\log d(n)$ is 

$\log 2 \,\log\log n.$

THEOREM 432. If $\epsilon$ is positive, then 

(22.13.1) $2^{(1 - \epsilon)\log\log n} < d(n) < 2^{(1 + \epsilon)\log\log n}$

for almost all numbers n. 

Thus d(n) is ‘usually’ about 

$2^{\log\log n} = (\log n)^{\log 2} = (\log n)^{\cdot 69\ldots}.$

We cannot quite say that ‘the normal order of d(n) is $2^{\log\log n}$’ since the 
inequalities (22.13.1) are of a less precise type than (22.11.1); but one 
may say, more roughly, that ‘the normal order of d(n) is about $2^{\log\log n}$’. 

It should be observed that this normal order is notably less than 
the average order $\log n$. The average 

$\frac{1}{n}\{d(1) + d(2) + \dots + d(n)\}$

is dominated, not by the ‘normal’ n for which d(n) has its most common 
magnitude, but by the small minority of n for which d(n) is very much 
larger than $\log n$.‡ The irregularities of $\omega(n)$ and $\Omega(n)$ are not 
sufficiently violent to produce a similar effect. 
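
The contrast between the average and the ‘usual’ size of d(n) shows itself even in a small range. In the sketch below (ours; the names `divisor_counts` and `mean_and_median` are hypothetical, and the median is used merely as a convenient stand-in for the ‘usual’ value) the divisor function is tabulated up to a limit and its mean is compared with its median.

```python
import math
import statistics

def divisor_counts(limit):
    """d[n] = number of divisors of n for 1 <= n <= limit (d[0] is unused)."""
    d = [0] * (limit + 1)
    for k in range(1, limit + 1):
        for multiple in range(k, limit + 1, k):
            d[multiple] += 1       # k divides each of its multiples
    return d

def mean_and_median(limit):
    """Mean and median of d(1), d(2), ..., d(limit)."""
    values = divisor_counts(limit)[1:]
    return sum(values) / limit, statistics.median(values)
```

For the numbers up to 10000 the mean is within 1 of log 10000, as the average order requires, while the median is smaller than the mean.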


† Roughly, if $\chi(x)$ were of higher order than $\log\log x$, and $\omega(n)$ were larger than 
$\chi(n)$ for a fixed proportion of numbers less than x, then $\sum_{n \le x} \omega(n)$ 
would be larger than a fixed multiple of $x\chi(x)$, in contradiction to Theorem 430. 
‡ See the remarks at the ends of §§ 18.1 and 18.2. 

22.14. Selberg’s Theorem. We devote the next three sections to 
the proof of Theorem 6. Of the earlier results of this chapter we use 
only Theorems 420-4 and the fact that 

(22.14.1) $\psi(x) = O(x),$

which is part of Theorem 414. We prove first 

THEOREM 433 (Selberg’s Theorem): 

(22.14.2) $\psi(x)\log x + \sum_{n \le x} \Lambda(n)\psi\Bigl(\frac{x}{n}\Bigr) = 2x\log x + O(x)$

and 

(22.14.3) $\sum_{n \le x} \Lambda(n)\log n + \sum_{mn \le x} \Lambda(m)\Lambda(n) = 2x\log x + O(x).$

It is easy to see that (22.14.2) and (22.14.3) are equivalent. For 

$\sum_{n \le x} \Lambda(n)\psi\Bigl(\frac{x}{n}\Bigr) = \sum_{n \le x} \Lambda(n) \sum_{m \le x/n} \Lambda(m) = \sum_{mn \le x} \Lambda(m)\Lambda(n)$

and, if we put $c_n = \Lambda(n)$ and $f(t) = \log t$ in (22.5.2), 

(22.14.4) $\sum_{n \le x} \Lambda(n)\log n = \psi(x)\log x - \int_{2}^{x} \frac{\psi(t)}{t}\,dt = \psi(x)\log x + O(x)$

by (22.14.1). 
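In our proof of (22.14.3) we use the Möbius function $\mu(n)$ defined in 
§ 16.3. We recall Theorems 263, 296, and 298 by which 

(22.14.5) $\sum_{d \mid n} \mu(d) = 1 \quad (n = 1), \qquad \sum_{d \mid n} \mu(d) = 0 \quad (n > 1),$

(22.14.6) $\Lambda(n) = -\sum_{d \mid n} \mu(d)\log d, \qquad \log n = \sum_{d \mid n} \Lambda(d).$

Hence 

(22.14.7) $\sum_{hk = n} \Lambda(h)\Lambda(k) = \sum_{h \mid n} \Lambda(h)\Lambda\Bigl(\frac{n}{h}\Bigr) = -\sum_{h \mid n} \Lambda(h) \sum_{d \mid (n/h)} \mu(d)\log d$
$= -\sum_{d \mid n} \mu(d)\log d \sum_{h \mid (n/d)} \Lambda(h) = -\sum_{d \mid n} \mu(d)\log d\,\log\Bigl(\frac{n}{d}\Bigr)$
$= \Lambda(n)\log n + \sum_{d \mid n} \mu(d)\log^2 d.$

Again, by (22.14.5), 

$\sum_{d \mid 1} \mu(d)\log^2\Bigl(\frac{x}{d}\Bigr) = \log^2 x,$

but, for $n > 1$, 

$\sum_{d \mid n} \mu(d)\log^2\Bigl(\frac{x}{d}\Bigr) = \sum_{d \mid n} \mu(d)(\log^2 d - 2\log x\,\log d)$
$= 2\Lambda(n)\log x - \Lambda(n)\log n + \sum_{hk = n} \Lambda(h)\Lambda(k)$

by (22.14.6) and (22.14.7). Hence, if we write 

$S(x) = \sum_{n \le x} \sum_{d \mid n} \mu(d)\log^2\Bigl(\frac{x}{d}\Bigr),$

we have 

$S(x) = \log^2 x + 2\psi(x)\log x - \sum_{n \le x} \Lambda(n)\log n + \sum_{hk \le x} \Lambda(h)\Lambda(k)$
$= \sum_{n \le x} \Lambda(n)\log n + \sum_{mn \le x} \Lambda(m)\Lambda(n) + O(x)$

by (22.14.4). To complete the proof of (22.14.3), we have only to show 
that 

(22.14.8) $S(x) = 2x\log x + O(x).$

By (22.14.5), 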
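$S(x) - \gamma^2 = \sum_{n \le x} \sum_{d \mid n} \mu(d)\Bigl\{\log^2\Bigl(\frac{x}{d}\Bigr) - \gamma^2\Bigr\}$
$= \sum_{d \le x} \mu(d)\left[\frac{x}{d}\right]\Bigl\{\log^2\Bigl(\frac{x}{d}\Bigr) - \gamma^2\Bigr\},$

since the number of $n \le x$, for which $d \mid n$, is $[x/d]$. If we remove the 
square brackets, the error introduced is less than 

$\sum_{d \le x} \Bigl\{\log^2\Bigl(\frac{x}{d}\Bigr) + \gamma^2\Bigr\} = O(x)$

by Theorem 423. Hence 

(22.14.9) $S(x) = x \sum_{d \le x} \frac{\mu(d)}{d}\Bigl\{\log^2\Bigl(\frac{x}{d}\Bigr) - \gamma^2\Bigr\} + O(x).$

Now, by Theorem 422, 

(22.14.10) $\sum_{d \le x} \frac{\mu(d)}{d}\Bigl\{\log^2\Bigl(\frac{x}{d}\Bigr) - \gamma^2\Bigr\} = \sum_{d \le x} \frac{\mu(d)}{d}\Bigl\{\log\Bigl(\frac{x}{d}\Bigr) - \gamma\Bigr\}\Bigl\{\sum_{k \le x/d} \frac{1}{k} + O\Bigl(\frac{d}{x}\Bigr)\Bigr\}.$

The sum of the various error terms is at most 

(22.14.11) $\sum_{d \le x} \frac{1}{d}\Bigl\{\log\Bigl(\frac{x}{d}\Bigr) + \gamma\Bigr\}\,O\Bigl(\frac{d}{x}\Bigr) = O\Bigl(\frac{1}{x}\Bigr) \sum_{d \le x} \Bigl\{\log\Bigl(\frac{x}{d}\Bigr) + \gamma\Bigr\} = O(1)$

by Theorem 423. Also 

(22.14.12) $\sum_{d \le x} \frac{\mu(d)}{d}\Bigl\{\log\Bigl(\frac{x}{d}\Bigr) - \gamma\Bigr\} \sum_{k \le x/d} \frac{1}{k}$
$= \sum_{dk \le x} \frac{\mu(d)}{dk}\Bigl\{\log\Bigl(\frac{x}{d}\Bigr) - \gamma\Bigr\} = \sum_{n \le x} \frac{1}{n} \sum_{d \mid n} \mu(d)\Bigl\{\log\Bigl(\frac{x}{d}\Bigr) - \gamma\Bigr\}$
$= \log x - \gamma + \sum_{2 \le n \le x} \frac{\Lambda(n)}{n} = 2\log x + O(1)$

by (22.14.5), (22.14.6), and Theorem 424. (22.14.8) follows when we 
combine (22.14.9)-(22.14.12). 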
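
Selberg's formula (22.14.2) can be examined numerically for moderate x. The sketch below is our own illustration (the names `psi_table` and `selberg_remainder` are hypothetical); it computes the left-hand side of (22.14.2) for an integer x, subtracts $2x\log x$, and divides by x, so that the theorem asserts the boundedness of the result.

```python
import math

def psi_table(limit):
    """Return (lam, psi) with lam[n] = Lambda(n) and psi[n] = sum of lam[1..n]."""
    lam = [0.0] * (limit + 1)
    composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not composite[p]:
            for k in range(p * p, limit + 1, p):
                composite[k] = 1
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    psi = [0.0] * (limit + 1)
    for n in range(1, limit + 1):
        psi[n] = psi[n - 1] + lam[n]
    return lam, psi

def selberg_remainder(x):
    """(psi(x) log x + sum_{n <= x} Lambda(n) psi(x/n) - 2 x log x) / x, integer x >= 2."""
    lam, psi = psi_table(x)
    total = psi[x] * math.log(x)
    # psi(x/n) depends only on [x/n], so the table may be indexed by x // n.
    total += sum(lam[n] * psi[x // n] for n in range(2, x + 1))
    return (total - 2 * x * math.log(x)) / x
```

In the cases we have tried the quotient is negative and of modest size, in accordance with the term O(x).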


22.15. The functions $R(x)$ and $V(\xi)$. After Theorem 420 the Prime 
Number Theorem (Theorem 6) is equivalent to 

THEOREM 434: $\psi(x) \sim x,$

and it is this last theorem that we shall prove. If we put 

$\psi(x) = x + R(x)$

in (22.14.2) and use Theorem 424, we have 

(22.15.1) $R(x)\log x + \sum_{n \le x} \Lambda(n)R\Bigl(\frac{x}{n}\Bigr) = O(x).$

Our object is to prove that $R(x) = o(x)$.† 
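† Of course, this would be a trivial deduction if $R(x) \ge 0$ for all x (or if $R(x) \le 0$ 
for all x). Indeed, more would follow, viz. $R(x) = O(x/\log x)$. But it is possible, so far 
as we know at this stage of our argument, that $R(x)$ is usually of order x, but that its 
positive and negative values are so distributed that the sum over n on the left-hand 
side of (22.15.1) is of opposite sign to the first term and largely offsets it. 

If we replace n by m and x by x/n in (22.15.1), we have 

$R\Bigl(\frac{x}{n}\Bigr)\log\Bigl(\frac{x}{n}\Bigr) + \sum_{m \le x/n} \Lambda(m)R\Bigl(\frac{x}{mn}\Bigr) = O\Bigl(\frac{x}{n}\Bigr).$

Hence 

$\log x\,\Bigl\{R(x)\log x + \sum_{n \le x} \Lambda(n)R\Bigl(\frac{x}{n}\Bigr)\Bigr\} - \sum_{n \le x} \Lambda(n)\Bigl\{R\Bigl(\frac{x}{n}\Bigr)\log\Bigl(\frac{x}{n}\Bigr) + \sum_{m \le x/n} \Lambda(m)R\Bigl(\frac{x}{mn}\Bigr)\Bigr\}$
$= O(x\log x) + O\Bigl(x \sum_{n \le x} \frac{\Lambda(n)}{n}\Bigr) = O(x\log x),$

that is 

$R(x)\log^2 x = -\sum_{n \le x} \Lambda(n)R\Bigl(\frac{x}{n}\Bigr)\log n + \sum_{mn \le x} \Lambda(m)\Lambda(n)R\Bigl(\frac{x}{mn}\Bigr) + O(x\log x),$

whence 

(22.15.2) $|R(x)|\log^2 x \le \sum_{n \le x} a_n \Bigl|R\Bigl(\frac{x}{n}\Bigr)\Bigr| + O(x\log x),$

where 

$a_n = \Lambda(n)\log n + \sum_{hk = n} \Lambda(h)\Lambda(k)$

and 

$\sum_{n \le x} a_n = 2x\log x + O(x)$

by (22.14.3). 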


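We now replace the sum on the right-hand side of (22.15.2) by an 
integral. To do so, we shall prove that 

(22.15.3) $\sum_{n \le x} a_n \Bigl|R\Bigl(\frac{x}{n}\Bigr)\Bigr| = 2\int_{1}^{x} \Bigl|R\Bigl(\frac{x}{t}\Bigr)\Bigr|\log t\,dt + O(x\log x).$

We remark that, if $t > t' \ge 0$, 

$\bigl||R(t)| - |R(t')|\bigr| \le |R(t) - R(t')| = |\psi(t) - \psi(t') - t + t'|$
$\le \psi(t) - \psi(t') + t - t' = F(t) - F(t'),$

where 

$F(t) = \psi(t) + t = O(t)$

and $F(t)$ is a steadily increasing function of t. Also 

(22.15.4) $\sum_{n \le x-1} n\Bigl\{F\Bigl(\frac{x}{n}\Bigr) - F\Bigl(\frac{x}{n+1}\Bigr)\Bigr\} = \sum_{n \le x} F\Bigl(\frac{x}{n}\Bigr) - [x]F\Bigl(\frac{x}{[x]}\Bigr)$
$= O\Bigl(x \sum_{n \le x} \frac{1}{n}\Bigr) = O(x\log x).$

We prove (22.15.3) in two stages. First, if we put 

$c_1 = 0, \qquad c_n = a_n - 2\int_{n-1}^{n} \log t\,dt, \qquad f(n) = \Bigl|R\Bigl(\frac{x}{n}\Bigr)\Bigr|$

in (22.5.1), we have 

$C(x) = \sum_{n \le x} a_n - 2\int_{1}^{[x]} \log t\,dt = O(x)$

and 

(22.15.5) $\sum_{n \le x} a_n \Bigl|R\Bigl(\frac{x}{n}\Bigr)\Bigr| - 2\sum_{2 \le n \le x} \Bigl|R\Bigl(\frac{x}{n}\Bigr)\Bigr| \int_{n-1}^{n} \log t\,dt$
$= \sum_{n \le x-1} C(n)\Bigl\{\Bigl|R\Bigl(\frac{x}{n}\Bigr)\Bigr| - \Bigl|R\Bigl(\frac{x}{n+1}\Bigr)\Bigr|\Bigr\} + C(x)\Bigl|R\Bigl(\frac{x}{[x]}\Bigr)\Bigr|$
$= O\Bigl(\sum_{n \le x-1} n\Bigl\{F\Bigl(\frac{x}{n}\Bigr) - F\Bigl(\frac{x}{n+1}\Bigr)\Bigr\}\Bigr) + O(x) = O(x\log x)$

by (22.15.4). Next 

(22.15.6) $\Bigl|\sum_{2 \le n \le x} \Bigl|R\Bigl(\frac{x}{n}\Bigr)\Bigr| \int_{n-1}^{n} \log t\,dt - \int_{1}^{x} \Bigl|R\Bigl(\frac{x}{t}\Bigr)\Bigr|\log t\,dt\Bigr|$
$\le \sum_{2 \le n \le x} \int_{n-1}^{n} \Bigl|\Bigl|R\Bigl(\frac{x}{n}\Bigr)\Bigr| - \Bigl|R\Bigl(\frac{x}{t}\Bigr)\Bigr|\Bigr|\log t\,dt + \int_{[x]}^{x} \Bigl|R\Bigl(\frac{x}{t}\Bigr)\Bigr|\log t\,dt$
$\le \sum_{2 \le n \le x} \int_{n-1}^{n} \Bigl\{F\Bigl(\frac{x}{t}\Bigr) - F\Bigl(\frac{x}{n}\Bigr)\Bigr\}\log t\,dt + O(x\log x)$
$\le \sum_{n \le x-1} n\Bigl\{F\Bigl(\frac{x}{n}\Bigr) - F\Bigl(\frac{x}{n+1}\Bigr)\Bigr\} + O(x\log x) = O(x\log x).$

Combining (22.15.5) and (22.15.6) we have (22.15.3). 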
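Using (22.15.3) in (22.15.2) we have 

(22.15.7) $|R(x)|\log^2 x \le 2\int_{1}^{x} \Bigl|R\Bigl(\frac{x}{t}\Bigr)\Bigr|\log t\,dt + O(x\log x).$

We can make the significance of this inequality a little clearer if we 
introduce a new function, viz. 

(22.15.8) $V(\xi) = e^{-\xi}R(e^{\xi}) = e^{-\xi}\psi(e^{\xi}) - 1 = e^{-\xi}\Bigl\{\sum_{n \le e^{\xi}} \Lambda(n)\Bigr\} - 1.$

If we write $x = e^{\xi}$ and $t = xe^{-\eta}$, we have 

$\int_{1}^{x} \Bigl|R\Bigl(\frac{x}{t}\Bigr)\Bigr|\log t\,dt = x\int_{0}^{\xi} |V(\eta)|(\xi - \eta)\,d\eta = x\int_{0}^{\xi} |V(\eta)| \int_{\eta}^{\xi} d\zeta\,d\eta$
$= x\int_{0}^{\xi} \int_{0}^{\zeta} |V(\eta)|\,d\eta\,d\zeta$

on changing the order of integration. (22.15.7) becomes 

(22.15.9) $\xi^2 |V(\xi)| \le 2\int_{0}^{\xi} \int_{0}^{\zeta} |V(\eta)|\,d\eta\,d\zeta + O(\xi).$

Since $\psi(x) = O(x)$, it follows from (22.15.8) that $V(\xi)$ is bounded as 
$\xi \to \infty$. Hence we may write 

$\alpha = \limsup_{\xi \to \infty} |V(\xi)|, \qquad \beta = \limsup_{\xi \to \infty} \frac{1}{\xi}\int_{0}^{\xi} |V(\eta)|\,d\eta,$

since both these upper limits exist. Clearly 

(22.15.10) $|V(\xi)| \le \alpha + o(1)$

and 

$\int_{0}^{\xi} |V(\eta)|\,d\eta \le \beta\xi + o(\xi).$

Using this in (22.15.9), we have 

$\xi^2 |V(\xi)| \le 2\int_{0}^{\xi} \{\beta\zeta + o(\zeta)\}\,d\zeta + O(\xi) = \beta\xi^2 + o(\xi^2)$

and so 

$|V(\xi)| \le \beta + o(1).$

Hence 

(22.15.11) $\alpha \le \beta.$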

22.16. Completion of the proof of Theorems 434, 6, and 8. 
By (22.15.8), Theorem 434 is equivalent to the statement that $V(\xi) \to 0$ 
as $\xi \to \infty$, that is, that $\alpha = 0$. We now suppose that $\alpha > 0$ and prove 
that, in that case, $\beta < \alpha$ in contradiction to (22.15.11). We require two 
further lemmas. 

THEOREM 435. There is a fixed positive number $A_1$, such that, for every 
positive $\xi_1$, $\xi_2$, we have 

$\Bigl|\int_{\xi_1}^{\xi_2} V(\eta)\,d\eta\Bigr| < A_1.$

If we put $x = e^{\xi}$, $t = e^{\eta}$, we have 

$\int_{0}^{\xi} V(\eta)\,d\eta = \int_{1}^{x} \Bigl\{\frac{\psi(t)}{t^2} - \frac{1}{t}\Bigr\}\,dt = O(1)$

by (22.6.1). Hence 

$\int_{\xi_1}^{\xi_2} V(\eta)\,d\eta = \int_{0}^{\xi_2} V(\eta)\,d\eta - \int_{0}^{\xi_1} V(\eta)\,d\eta = O(1)$

and this is Theorem 435. 
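THEOREM 436. If $\eta_0 > 0$ and $V(\eta_0) = 0$, then 

$\int_{0}^{\alpha} |V(\eta_0 + \tau)|\,d\tau \le \tfrac12\alpha^2 + O(\eta_0^{-1}).$

We may write (22.14.2) in the form 

$\psi(x)\log x + \sum_{mn \le x} \Lambda(m)\Lambda(n) = 2x\log x + O(x).$

If $x > x_0 \ge 1$, the same result is true with $x_0$ substituted for x. 
Subtracting, we have 

$\psi(x)\log x - \psi(x_0)\log x_0 + \sum_{x_0 < mn \le x} \Lambda(m)\Lambda(n) = 2(x\log x - x_0\log x_0) + O(x).$

Since $\Lambda(n) \ge 0$, 

$0 \le \psi(x)\log x - \psi(x_0)\log x_0 \le 2(x\log x - x_0\log x_0) + O(x),$

whence 

$|R(x)\log x - R(x_0)\log x_0| \le x\log x - x_0\log x_0 + O(x).$

We put $x = e^{\eta_0 + \tau}$, $x_0 = e^{\eta_0}$, so that $R(x_0) = 0$. We have, since 
$0 \le \tau \le \alpha$, 

$|V(\eta_0 + \tau)| \le 1 - \Bigl(\frac{\eta_0}{\eta_0 + \tau}\Bigr)e^{-\tau} + O\Bigl(\frac{1}{\eta_0}\Bigr)$
$= 1 - e^{-\tau} + O\Bigl(\frac{1}{\eta_0}\Bigr) \le \tau + O\Bigl(\frac{1}{\eta_0}\Bigr)$

and so 

$\int_{0}^{\alpha} |V(\eta_0 + \tau)|\,d\tau \le \int_{0}^{\alpha} \tau\,d\tau + O\Bigl(\frac{1}{\eta_0}\Bigr) = \tfrac12\alpha^2 + O\Bigl(\frac{1}{\eta_0}\Bigr).$
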

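We now write 

$\delta = \frac{3\alpha^2 + 4A_1}{2\alpha} > \alpha,$

take $\zeta$ to be any positive number and consider the behaviour of $V(\eta)$ 
in the interval $\zeta \le \eta \le \zeta + \delta - \alpha$. By (22.15.8), $V(\eta)$ decreases steadily 
as $\eta$ increases, except at its discontinuities, where $V(\eta)$ increases. 
Hence, in our interval, either $V(\eta_0) = 0$ for some $\eta_0$ or $V(\eta)$ changes 
sign at most once. In the first case, we use (22.15.10) and Theorem 436 
and have 

$\int_{\zeta}^{\zeta+\delta} |V(\eta)|\,d\eta = \int_{\zeta}^{\eta_0} + \int_{\eta_0}^{\eta_0+\alpha} + \int_{\eta_0+\alpha}^{\zeta+\delta} |V(\eta)|\,d\eta$
$\le \alpha(\eta_0 - \zeta) + \tfrac12\alpha^2 + \alpha(\zeta + \delta - \eta_0 - \alpha) + o(1)$
$= \alpha(\delta - \tfrac12\alpha) + o(1) = \alpha'\delta + o(1)$

for large $\zeta$, where 

$\alpha' = \alpha\Bigl(1 - \frac{\alpha}{2\delta}\Bigr) < \alpha.$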


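In the second case, if $V(\eta)$ changes sign just once at $\eta = \eta_1$ in the
interval $\zeta \le \eta \le \zeta+\delta-\alpha$, we have

$\int_{\zeta}^{\zeta+\delta-\alpha} |V(\eta)|\,d\eta = \left| \int_{\zeta}^{\eta_1} V(\eta)\,d\eta \right| + \left| \int_{\eta_1}^{\zeta+\delta-\alpha} V(\eta)\,d\eta \right| < 2A_1,$

while, if $V(\eta)$ does not change sign at all in the interval, we have

$\int_{\zeta}^{\zeta+\delta-\alpha} |V(\eta)|\,d\eta = \left| \int_{\zeta}^{\zeta+\delta-\alpha} V(\eta)\,d\eta \right| < A_1$

by Theorem 435. Hence

$\int_{\zeta}^{\zeta+\delta} |V(\eta)|\,d\eta = \int_{\zeta}^{\zeta+\delta-\alpha} + \int_{\zeta+\delta-\alpha}^{\zeta+\delta} |V(\eta)|\,d\eta < 2A_1 + \alpha^2 + o(1) = \alpha''\delta + o(1),$

where

$\alpha'' = \frac{2A_1+\alpha^2}{\delta} = \alpha\left(\frac{4A_1+2\alpha^2}{4A_1+3\alpha^2}\right) = \alpha\left(1 - \frac{\alpha}{2\delta}\right) = \alpha'.$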
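Hence we have always

$\int_{\zeta}^{\zeta+\delta} |V(\eta)|\,d\eta \le \alpha'\delta + o(1),$

where $o(1) \to 0$ as $\zeta \to \infty$. If $M = [\xi/\delta]$,

$\int_0^{\xi} |V(\eta)|\,d\eta = \sum_{m=0}^{M-1} \int_{m\delta}^{(m+1)\delta} |V(\eta)|\,d\eta + \int_{M\delta}^{\xi} |V(\eta)|\,d\eta$

$\le \alpha' M\delta + o(M) + O(1) = \alpha'\xi + o(\xi).$

Hence

$\beta = \overline{\lim}\;\frac{1}{\xi}\int_0^{\xi} |V(\eta)|\,d\eta \le \alpha' < \alpha,$

in contradiction to (22.15.11). It follows that $\alpha = 0$, whence we have
Theorem 434 and Theorem 6. As we saw on p. 10, Theorem 8 is a trivial
deduction from Theorem 6.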
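
The statement $V(\xi) \to 0$ can be watched numerically. The following is a minimal sketch, not part of the proof: it assumes that $V(\xi) = e^{-\xi}\psi(e^{\xi}) - 1$ (our reading of (22.15.8)), and the function names `chebyshev_psi` and `V` are ours, introduced only for illustration.

```python
import math

def chebyshev_psi(x):
    """Return psi(x): the sum of log p over all prime powers p^m <= x."""
    n = int(x)
    is_prime = bytearray([1]) * (n + 1)
    total = 0.0
    for p in range(2, n + 1):
        if is_prime[p]:
            # p is prime: strike out its multiples, starting at p*p
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = 0
            # each of p, p^2, p^3, ... not exceeding n contributes log p
            power = p
            while power <= n:
                total += math.log(p)
                power *= p
    return total

def V(xi):
    """Assumed form of V: exp(-xi) * psi(exp(xi)) - 1."""
    x = math.exp(xi)
    return chebyshev_psi(x) / x - 1.0

if __name__ == "__main__":
    for xi in (4.0, 7.0, 10.0, 13.0):
        print(f"xi = {xi:5.1f}   V(xi) = {V(xi):+.5f}")
```

The printed values should shrink in absolute value as $\xi$ grows, though of course no finite table proves anything about the limit.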


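22.17. Proof of Theorem 335. Theorem 335 is a simple consequence
of Theorem 434. We have

$\sum_{n \le x} \mu(n)\log\frac{x}{n} = O(x)$

by Theorem 423 and so

$M(x)\log x = \sum_{n \le x} \mu(n)\log n + O(x).$

By Theorem 297, with the notation of § 22.15,

$-\sum_{n \le x} \mu(n)\log n = \sum_{n \le x}\sum_{d \mid n} \mu\!\left(\frac{n}{d}\right)\Lambda(d) = \sum_{dk \le x} \mu(k)\Lambda(d)$

$= \sum_{k \le x} \mu(k)\,\psi\!\left(\frac{x}{k}\right) = \sum_{k \le x} \mu(k)\,\psi\!\left(\left[\frac{x}{k}\right]\right)$

$= \sum_{k \le x} \mu(k)\left[\frac{x}{k}\right] + \sum_{k \le x} \mu(k)\,R\!\left(\left[\frac{x}{k}\right]\right) = S_3 + S_4$

(say). Now, by (22.14.5),

$S_3 = \sum_{k \le x} \mu(k)\left[\frac{x}{k}\right] = \sum_{k \le x} \mu(k) \sum_{n \le x,\; k \mid n} 1 = \sum_{n \le x}\sum_{k \mid n} \mu(k) = 1.$

By Theorem 434, $R(x) = o(x)$; that is, for any $\epsilon > 0$, there is an integer
$N = N(\epsilon)$ such that $|R(x)| < \epsilon x$ for all $x \ge N$. Again, by Theorem 414,
$|R(x)| < Ax$ for all $x \ge 1$. Hence

$|S_4| \le \sum_{k \le x} \left| R\!\left(\left[\frac{x}{k}\right]\right) \right| \le \sum_{k \le x/N} \epsilon\left[\frac{x}{k}\right] + \sum_{x/N < k \le x} A\left[\frac{x}{k}\right]$

$\le \epsilon x\log(x/N) + Ax\{\log x - \log(x/N)\} + O(x)$

$= \epsilon x\log x + O(x).$

Since $\epsilon$ is arbitrary, it follows that $S_4 = o(x\log x)$ and so

$-M(x)\log x = S_3 + S_4 + O(x) = o(x\log x),$

whence Theorem 335.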
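
Theorem 335 asserts that $M(x) = o(x)$, where $M(x) = \sum_{n \le x} \mu(n)$. A short computation shows how small $M(x)$ is in practice. This is only an illustrative sketch; the helper names `mobius_table` and `mertens` are ours.

```python
def mobius_table(n):
    """Return a list mu with mu[m] = Moebius function of m for 0 <= m <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]            # one more distinct prime factor
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0                 # m is divisible by a square
    return mu

def mertens(x):
    """M(x) = sum of mu(n) for 1 <= n <= x."""
    return sum(mobius_table(x)[1:])

if __name__ == "__main__":
    for x in (10, 100, 1000, 10000, 100000):
        m = mertens(x)
        print(f"x = {x:6d}   M(x) = {m:5d}   M(x)/x = {m / x:+.5f}")
```

The ratio $M(x)/x$ printed in the last column is the quantity that the theorem asserts tends to zero.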




22.18. Products of k prime factors. Let $k \ge 1$ and consider a
positive integer $n$ which is the product of just $k$ prime factors, i.e.

(22.18.1) $n = p_1 p_2 \cdots p_k.$

In the notation of § 22.10, $\Omega(n) = k$. We write $\tau_k(x)$ for the number
of such $n \le x$. If we impose the additional restriction that all the $p$
in (22.18.1) shall be different, $n$ is quadratfrei and $\omega(n) = \Omega(n) = k$.
We write $\pi_k(x)$ for the number of these (quadratfrei) $n \le x$. We shall
prove

THEOREM 437: $\pi_k(x) \sim \tau_k(x) \sim \frac{x(\log\log x)^{k-1}}{(k-1)!\,\log x} \quad (k \ge 2).$

For $k = 1$, this result would reduce to Theorem 6, if, as usual, we
take $0! = 1$.
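
Before the proof, it may help to see the two counting functions side by side with the asymptotic value. The sketch below computes $\tau_k(x)$ and $\pi_k(x)$ by a sieve; the names `omega_tables`, `tau_k`, `pi_k`, and `asymptotic` are ours. Since $\log\log x$ grows very slowly, close numerical agreement should not be expected for small $x$.

```python
import math

def omega_tables(x):
    """Return (big, small): big[n] = Omega(n), small[n] = omega(n), 0 <= n <= x."""
    big = [0] * (x + 1)
    small = [0] * (x + 1)
    for p in range(2, x + 1):
        if small[p] == 0:                      # no smaller prime divides p
            for m in range(p, x + 1, p):
                small[m] += 1                  # p is a distinct prime factor of m
            power = p
            while power <= x:
                for m in range(power, x + 1, power):
                    big[m] += 1                # one more factor p, with multiplicity
                power *= p
    return big, small

def tau_k(x, k):
    """Number of n <= x that are products of just k primes."""
    big, _ = omega_tables(x)
    return sum(1 for n in range(2, x + 1) if big[n] == k)

def pi_k(x, k):
    """Number of quadratfrei n <= x that are products of just k primes."""
    big, small = omega_tables(x)
    return sum(1 for n in range(2, x + 1) if big[n] == k and small[n] == k)

def asymptotic(x, k):
    """The main term x (log log x)^(k-1) / ((k-1)! log x) of Theorem 437."""
    return x * math.log(math.log(x)) ** (k - 1) / (math.factorial(k - 1) * math.log(x))

if __name__ == "__main__":
    x = 10 ** 5
    for k in (1, 2, 3):
        print(k, pi_k(x, k), tau_k(x, k), round(asymptotic(x, k)))
```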


To prove Theorem 437, we introduce three auxiliary functions, viz.

$L_k(x) = \sum \frac{1}{p_1 p_2 \cdots p_k}, \quad \Pi_k(x) = \sum 1, \quad \vartheta_k(x) = \sum \log(p_1 p_2 \cdots p_k),$

where the summation in each case extends over all sets of primes
$p_1, p_2, \ldots, p_k$ such that $p_1 \cdots p_k \le x$, two sets being considered different
even if they differ only in the order of the $p$. If we write $c_n$ for the
number of ways in which $n$ can be represented in the form (22.18.1),
we have

$\Pi_k(x) = \sum_{n \le x} c_n, \quad \vartheta_k(x) = \sum_{n \le x} c_n \log n.$

If all the $p$ in (22.18.1) are different, $c_n = k!$, while in any case
$c_n \le k!$. If $n$ is not of the form (22.18.1), $c_n = 0$. Hence
(22.18.2) $k!\,\pi_k(x) \le \Pi_k(x) \le k!\,\tau_k(x) \quad (k \ge 1).$
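Again, for $k \ge 2$, consider those $n$ which are of the form (22.18.1) with
at least two of the $p$ equal. The number of these $n \le x$ is $\tau_k(x) - \pi_k(x)$.
Every such $n$ can be expressed in the form (22.18.1) with $p_{k-1} = p_k$
and so

(22.18.3) $\tau_k(x) - \pi_k(x) \le \sum_{p_1 p_2 \cdots p_{k-1}^2 \le x} 1 \le \sum_{p_1 p_2 \cdots p_{k-1} \le x} 1 = \Pi_{k-1}(x) \quad (k \ge 2).$

We shall prove below that
(22.18.4) $\vartheta_k(x) \sim kx(\log\log x)^{k-1} \quad (k \ge 2).$
By (22.5.2) with $f(t) = \log t$, we have

$\vartheta_k(x) = \Pi_k(x)\log x - \int_2^x \frac{\Pi_k(t)}{t}\,dt.$

Now $\tau_k(x) \le x$ and so, by (22.18.2), $\Pi_k(t) = O(t)$ and

$\int_2^x \frac{\Pi_k(t)}{t}\,dt = O(x).$

Hence, for $k \ge 2$,

(22.18.5) $\Pi_k(x) = \frac{\vartheta_k(x)}{\log x} + O\!\left(\frac{x}{\log x}\right) \sim \frac{kx(\log\log x)^{k-1}}{\log x}$

by (22.18.4). But this is also true for $k = 1$ by Theorem 6, since
$\Pi_1(x) = \pi(x)$. When we use (22.18.5) in (22.18.2) and (22.18.3),
Theorem 437 follows at once.

We have now to prove (22.18.4). For all $k \ge 1$,

$k\,\vartheta_{k+1}(x) = \sum_{p_1 \cdots p_{k+1} \le x} \{\log(p_2 p_3 \cdots p_{k+1}) + \log(p_1 p_3 \cdots p_{k+1}) + \cdots + \log(p_1 p_2 \cdots p_k)\}$

$= (k+1) \sum_{p_1 \cdots p_{k+1} \le x} \log(p_2 p_3 \cdots p_{k+1}) = (k+1) \sum_{p_1 \le x} \vartheta_k\!\left(\frac{x}{p_1}\right)$

and, if we put $L_0(x) = 1$,

$L_k(x) = \sum_{p_1 \cdots p_k \le x} \frac{1}{p_1 \cdots p_k} = \sum_{p_1 \le x} \frac{1}{p_1}\,L_{k-1}\!\left(\frac{x}{p_1}\right).$

Hence, if we write

$f_k(x) = \vartheta_k(x) - kx\,L_{k-1}(x),$

we have

(22.18.6) $k f_{k+1}(x) = (k+1) \sum_{p \le x} f_k\!\left(\frac{x}{p}\right).$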


We use this to prove by induction that

(22.18.7) $f_k(x) = o\{x(\log\log x)^{k-1}\} \quad (k \ge 1).$

First $f_1(x) = \vartheta_1(x) - x = \vartheta(x) - x = o(x)$
by Theorems 6 and 420, so (22.18.7) is true for $k = 1$. Let us
suppose (22.18.7) true for $k = K \ge 1$ so that, for any $\epsilon > 0$, there is
an $x_0 = x_0(K, \epsilon)$ such that

$|f_K(x)| < \epsilon x(\log\log x)^{K-1}$

for all $x \ge x_0$. From the definition of $f_K(x)$, we see that

$|f_K(x)| < D$

for $1 \le x < x_0$, where $D$ depends only on $K$ and $\epsilon$. Hence

$\sum_{p \le x/x_0} \left| f_K\!\left(\frac{x}{p}\right) \right| < \epsilon(\log\log x)^{K-1} \sum_{p \le x/x_0} \frac{x}{p} < 2\epsilon x(\log\log x)^K$

for large enough $x$, by Theorem 427. Again

$\sum_{x/x_0 < p \le x} \left| f_K\!\left(\frac{x}{p}\right) \right| < D\pi(x) < Dx.$

Hence, by (22.18.6), since $K+1 \le 2K$,

$|f_{K+1}(x)| < 2x\{2\epsilon(\log\log x)^K + D\} < 5\epsilon x(\log\log x)^K$

for $x > x_1 = x_1(\epsilon, D, K) = x_1(\epsilon, K)$. Since $\epsilon$ is arbitrary, this implies
(22.18.7) for $k = K+1$ and it follows for all $k \ge 1$ by induction.

After (22.18.7), we can complete the proof of (22.18.4) by showing
that
(22.18.8) $L_k(x) \sim (\log\log x)^k \quad (k \ge 1).$

In (22.18.1), if every $p_i \le x^{1/k}$, then $n \le x$; conversely, if $n \le x$, then
$p_i \le x$ for every $i$. Hence

$\left( \sum_{p \le x^{1/k}} \frac{1}{p} \right)^k \le L_k(x) \le \left( \sum_{p \le x} \frac{1}{p} \right)^k.$

But, by Theorem 427,

$\sum_{p \le x} \frac{1}{p} \sim \log\log x, \quad \sum_{p \le x^{1/k}} \frac{1}{p} \sim \log\!\left(\frac{\log x}{k}\right) \sim \log\log x$

and (22.18.8) follows at once.

22.19. Primes in an interval. Suppose that $\epsilon > 0$, so that

(22.19.1) $\pi(x+\epsilon x) - \pi(x) = \frac{x+\epsilon x}{\log x + \log(1+\epsilon)} - \frac{x}{\log x} + o\!\left(\frac{x}{\log x}\right) = \frac{\epsilon x}{\log x} + o\!\left(\frac{x}{\log x}\right).$

The last expression is positive provided that $x > x_0(\epsilon)$. Hence there is always
a prime $p$ satisfying
(22.19.2) $x < p \le (1+\epsilon)x$
when $x > x_0(\epsilon)$. This result may be compared with Theorem 418. The latter
corresponds to the case $\epsilon = 1$ of (22.19.2), but holds for all $x \ge 1$.
If we put $\epsilon = 1$ in (22.19.1), we have

(22.19.3) $\pi(2x) - \pi(x) = \frac{x}{\log x} + o\!\left(\frac{x}{\log x}\right) \sim \pi(x).$

Thus, to a first approximation, the number of primes between $x$ and $2x$ is the
same as the number less than $x$. At first sight this is surprising, since we know
that the primes near $x$ ‘thin out’ (in some vague sense) as $x$ increases. In fact,
$\pi(2x) - 2\pi(x) \to -\infty$ as $x \to \infty$ (though we cannot prove this here), but this is
not inconsistent with (22.19.3), which is equivalent to

$\pi(2x) - 2\pi(x) = o\{\pi(x)\}.$
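
The comparison between $\pi(2x) - \pi(x)$ and $\pi(x)$ is easy to tabulate. The following sketch does so by a simple sieve; the function names `prime_counts` and `primes_in_upper_half` are ours and are introduced only for this illustration.

```python
def prime_counts(n):
    """Return a list pi with pi[m] = number of primes <= m, for 0 <= m <= n."""
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = 0
    if n >= 1:
        is_prime[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = 0
    pi, count = [], 0
    for m in range(n + 1):
        count += is_prime[m]
        pi.append(count)
    return pi

def primes_in_upper_half(x):
    """Return (pi(2x) - pi(x), pi(x)) for a positive integer x."""
    pi = prime_counts(2 * x)
    return pi[2 * x] - pi[x], pi[x]

if __name__ == "__main__":
    for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
        upper, lower = primes_in_upper_half(x)
        print(f"x = {x:8d}  pi(2x)-pi(x) = {upper:6d}  pi(x) = {lower:6d}  "
              f"ratio = {upper / lower:.4f}  pi(2x)-2pi(x) = {upper - lower}")
```

The ratio column approaches 1 slowly, while the last column drifts downwards, in keeping with the remark that $\pi(2x) - 2\pi(x) \to -\infty$.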

22.20. A conjecture about the distribution of prime pairs
p, p+2. Although, as we remarked in § 1.4, it is not known whether
there is an infinity of prime-pairs $p$, $p+2$, there is an argument which
makes it plausible that

(22.20.1) $P_2(x) \sim \frac{2C_2\,x}{(\log x)^2},$

where $P_2(x)$ is the number of these pairs with $p \le x$ and

(22.20.2) $C_2 = \prod_{p \ge 3} \left\{ \frac{p(p-2)}{(p-1)^2} \right\} = \prod_{p \ge 3} \left\{ 1 - \frac{1}{(p-1)^2} \right\}.$

We take $x$ any large positive number and write

$N = \prod_{p \le \sqrt{x}} p.$

We shall call any integer $n$ which is prime to $N$, i.e. any $n$ not divisible
by any prime $p$ not exceeding $\sqrt{x}$, a special integer and denote by $S(X)$
the number of special integers which are less than or equal to $X$. By
Theorem 62,

$S(N) = \phi(N) = N \prod_{p \le \sqrt{x}} \left( 1 - \frac{1}{p} \right) = N B(x)$


(say). Hence the proportion of special integers in the interval $(1, N)$
is $B(x)$. It is easily seen that the proportion is the same in any complete
set of residues (mod $N$) and so in any set of $rN$ consecutive
integers for any positive integral $r$.

If the proportion were the same in the interval $(1, x)$, we should have

$S(x) = x B(x) \sim \frac{2e^{-\gamma} x}{\log x}$

by Theorem 429. But this is false. For every composite $n$ not exceeding
$x$ has a prime factor not exceeding $\sqrt{x}$ and so the special $n$ not exceeding
$x$ are just the primes between $\sqrt{x}$ (exclusive) and $x$ (inclusive). We
have then

$S(x) = \pi(x) - \pi(\sqrt{x}) \sim \frac{x}{\log x}$

by Theorem 6. Hence the proportion of special integers in the interval
$(1, x)$ is about $\tfrac12 e^{\gamma}$ times the proportion in the interval $(1, N)$.

There is nothing surprising in this, for, in the notation of § 22.1,

$\log N = \vartheta(\sqrt{x}) \sim \sqrt{x}$

by Theorems 413 and 434, and so $N$ is much greater than $x$. The
proportion of special integers in every interval of length $N$ need not
be the same as that in a particular interval of (much shorter) length $x$.†
Indeed, $S(\sqrt{x}) = 0$, and so in the particular interval $(1, \sqrt{x})$ the proportion
is 0. We observe that the proportion in the interval $(N-x, N)$
is again about $1/\log x$, and that in the interval $(N-\sqrt{x}, N)$ is again 0.

Next we evaluate the number of pairs $n$, $n+2$ of special integers for
which $n \le N$. If $n$ and $n+2$ are both special, we must have

$n \equiv 1 \pmod 2, \quad n \equiv 2 \pmod 3$

and $n \equiv 1, 2, 3, \ldots, p-3, \text{ or } p-1 \pmod p \quad (3 < p \le \sqrt{x}).$

The number of different possible residues for $n$ (mod $N$) is therefore

$\prod_{3 \le p \le \sqrt{x}} (p-2) = \tfrac12 N \prod_{3 \le p \le \sqrt{x}} \left( 1 - \frac{2}{p} \right) = N B_1(x)$

(say) and this is the number of special pairs $n$, $n+2$ with $n \le N$.
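
The two residue counts used so far, $S(N) = \phi(N)$ for special integers and $\prod (p-2)$ for special pairs, can be checked by direct enumeration when $x$ is small enough for $N$ to be manageable. The sketch below does this; the function names are ours, and the brute-force loop over $1, \ldots, N$ is practical only for small $x$, since $N$ grows roughly like $e^{\sqrt{x}}$.

```python
from math import gcd, isqrt, prod

def primes_up_to(n):
    """Primes not exceeding n, by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, isqrt(p) + 1))]

def special_counts(x):
    """Return (N, S(N), number of special pairs n, n+2 with n <= N, prod(p-2))."""
    ps = primes_up_to(isqrt(x))
    N = prod(ps)                                   # product of primes <= sqrt(x)
    special = [n for n in range(1, N + 1) if gcd(n, N) == 1]
    pairs = sum(1 for n in special if gcd(n + 2, N) == 1)
    predicted_pairs = prod(p - 2 for p in ps if p >= 3)
    return N, len(special), pairs, predicted_pairs

if __name__ == "__main__":
    for x in (10, 50, 130, 200):
        print(x, special_counts(x))
```

For $x = 50$ the primes concerned are 2, 3, 5, 7, so that $N = 210$, $\phi(N) = 48$ and $(3-2)(5-2)(7-2) = 15$.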

Thus the proportion of special pairs in the interval $(1, N)$ is $B_1(x)$
and the same is clearly true in any interval of $rN$ consecutive integers.
In the smaller interval $(1, x)$, however, the proportion of special integers
is about $\tfrac12 e^{\gamma}$ times the proportion in the longer intervals. We may
therefore expect (and it is here only that we ‘expect’ and cannot prove)
that the proportion of special pairs $n$, $n+2$ in the interval $(1, x)$ is
about $(\tfrac12 e^{\gamma})^2$ times the proportion in the longer intervals. But the special
pairs in the interval $(1, x)$ are the prime pairs $p$, $p+2$ in the interval
$(\sqrt{x}, x)$. Hence we should expect that

$P_2(x) - P_2(\sqrt{x}) \sim \tfrac14 e^{2\gamma}\,x\,B_1(x).$

By Theorem 429,

$B(x) \sim \frac{2e^{-\gamma}}{\log x}$

and so

$\tfrac14 e^{2\gamma} B_1(x) \sim \frac{1}{(\log x)^2}\,\frac{B_1(x)}{\{B(x)\}^2}.$

But

$\frac{B_1(x)}{\{B(x)\}^2} = 2 \prod_{3 \le p \le \sqrt{x}} \frac{1-2/p}{(1-1/p)^2} = 2 \prod_{3 \le p \le \sqrt{x}} \frac{p(p-2)}{(p-1)^2} \to 2C_2$

as $x \to \infty$. Since $P_2(\sqrt{x}) = O(\sqrt{x})$, we have finally the result (22.20.1).

† Considerations of this kind explain why the usual ‘probability’ arguments lead to
the wrong asymptotic value for $\pi(x)$.
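
The conjectural formula (22.20.1) can be set against actual counts. The following sketch counts the pairs $p$, $p+2$ with $p \le x$ and compares the count with $2C_2 x/(\log x)^2$, where $C_2$ is approximated by truncating the product (22.20.2) at an arbitrary limit of our choosing. The function names are ours. The agreement for moderate $x$ is only rough, since the formula is a first approximation.

```python
import math

def prime_flags(n):
    """Return a bytearray f with f[m] = 1 exactly when m is prime, 0 <= m <= n."""
    flags = bytearray([1]) * (n + 1)
    flags[0] = 0
    if n >= 1:
        flags[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            for m in range(p * p, n + 1, p):
                flags[m] = 0
    return flags

def twin_pairs(x):
    """P_2(x): the number of prime pairs p, p+2 with p <= x."""
    flags = prime_flags(x + 2)
    return sum(1 for p in range(3, x + 1) if flags[p] and flags[p + 2])

def twin_constant(limit=10 ** 5):
    """Approximate C_2 by the product over primes 3 <= p <= limit (a truncation)."""
    flags = prime_flags(limit)
    c = 1.0
    for p in range(3, limit + 1):
        if flags[p]:
            c *= 1.0 - 1.0 / (p - 1) ** 2
    return c

def conjectured_pairs(x):
    """The right-hand side of (22.20.1), with the truncated value of C_2."""
    return 2.0 * twin_constant() * x / math.log(x) ** 2

if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
        print(x, twin_pairs(x), round(conjectured_pairs(x)))
```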


NOTES ON CHAPTER XXII 


§§ 22.1, 2, and 4. The theorems of these sections are essentially Tchebychef’s.
Theorem 416 was found independently by de Polignac. Theorem 415 is an improvement
of a result of Tchebychef’s; the proof we give here is due to Erdős and Kalmár.

There is full information about the history of the theory of primes in Dickson’s
History (i, ch. xviii), in Ingham’s tract (introduction and ch. i), and in Landau’s
Handbuch (8-102 and 883-5); and we do not give detailed references.

There is also an elaborate account of the early history of the theory in Torelli,
Sulla totalità dei numeri primi, Atti della R. Acad. di Napoli (2) 11 (1902), 1-222;
and shorter ones in the introductions to Glaisher’s Factor table for the sixth million
(London, 1883) and Lehmer’s table referred to in the note on § 1.4.

§ 22.3. ‘Bertrand’s postulate’ is that, for every n > 3, there is a prime p satisfying
n < p < 2n-2. Bertrand verified this for n < 3,000,000 and Tchebychef
proved it for all n > 3 in 1850. Our Theorem 418 states a little less but the proof
could be modified to prove the better result. Our proof is due to Erdős, Acta Litt.
Ac. Sci. (Szeged), 5 (1932), 194-8.

For Theorem 419, see L. Moser, Math. Mag. 23 (1950), 163-4. See also Mills,
Bull. American Math. Soc. 53 (1947), 604; Bang, Norsk. Mat. Tidsskr. 34 (1952),
117-18; and Wright, American Math. Monthly, 58 (1951), 616-18 and 59 (1952), 99
and Journal London Math. Soc. 29 (1954), 63-71.

§ 22.7. Euler proved in 1737 that $\sum p^{-1}$ and $\prod (1-p^{-1})$ are divergent.

§ 22.8. For Theorem 429 see Mertens, Journal für Math. 78 (1874), 46-62. For
another proof (given in the first two editions of this book) see Hardy, Journal
London Math. Soc. 10 (1935), 91-94.

§ 22.10. Theorem 430 is stated, in a rather more precise form, by Hardy and
Ramanujan, Quarterly Journal of Math. 48 (1917), 76-92 (no. 35 of Ramanujan’s
Collected papers). It may be older, but we cannot give any reference.

§§ 22.11-13. These theorems were first proved by Hardy and Ramanujan in
the paper referred to in the preceding note. The proof given here is due to Turán,
Journal London Math. Soc. 9 (1934), 274-6, except for a simplification suggested
to us by Mr. Marshall Hall.

Turán [ibid. 11 (1936), 125-33] has generalized the theorems in two directions.

§§ 22.14-16. A. Selberg gives his theorem in the forms

$\vartheta(x)\log x + \sum_{p \le x} \vartheta\!\left(\frac{x}{p}\right)\log p = 2x\log x + O(x)$

and

$\sum_{p \le x} \log^2 p + \sum_{pp' \le x} \log p \log p' = 2x\log x + O(x).$

These may be deduced without difficulty from Theorem 433. There are two
essentially different methods by which the Prime Number Theorem may be
deduced from Selberg’s theorem. For the first, due to Erdős and Selberg jointly,
see Proc. Nat. Acad. Sci. 35 (1949), 374-84 and for the second, due to Selberg
alone, see Annals of Math. 50 (1949), 305-13. Both methods are more ‘elementary’
(in the logical sense) than the one we give, since they avoid the use of the integral
calculus at the cost of a little complication of detail. The method which we use
in §§ 22.15 and 16 is based essentially on Selberg’s own method. For the use of
$\psi(x)$ instead of $\vartheta(x)$, the introduction of the integral calculus and other minor
changes, see Wright, Proc. Roy. Soc. Edinburgh, 63 (1951), 257-67.

For an alternative exposition of the elementary proof of Theorem 6, see van der
Corput, Colloques sur la théorie des nombres (Liège 1956). See Errera (ibid.
111-18) for the shortest (non-elementary) proof. The same volume (pp. 9-66)
contains a reprint of the original paper in which de la Vallée Poussin (contemporaneously
with Hadamard, but independently) gave the first proof (1896).

For an alternative to the work of § 22.15, see V. Nevanlinna, Soc. Sci. Fennica:
Comm. Phys. Math. 27/3 (1962), 1-7. The same author (Ann. Acad. Sci. Fennicae
A I 343 (1964), 1-52) gives a comparative account of the various elementary
proofs.

§ 22.18. Landau proved Theorem 437 in 1900 and found more detailed asymptotic
expansions for $\pi_k(x)$ and $\tau_k(x)$ in 1911. Subsequently Shah (1933) and
S. Selberg (1940) obtained results of the latter type by more elementary means.
For our proof and references to the literature, see Wright, Proc. Edinburgh Math.
Soc. 9 (1954), 87-90.

§ 22.20. This type of argument can be applied to obtain similar conjectural
asymptotic formulae for the number of prime-triplets and of longer blocks of
primes. These formulae agree very closely with the results of counts. They
were found by a different method by Hardy and Littlewood [Acta Math. 44
(1923), 1-70 (43)], who give references to work by Staeckel and others. See also
Cherwell, Quarterly Journal of Math. (Oxford), 17 (1946), 46-62, for another
simple heuristic method.

The ideas in this section had their origin in correspondence and conversation
with the late Lord Cherwell. See Cherwell and Wright, Quart. J. of Math. 11
(1960), 60-63, for a fuller account. See also Polya, Amer. Math. Monthly 66
(1959), 375-84.

The formulae agree very well with the results of counts. D. H. and E. Lehmer
have carried these out (on the SWAC computer) for various prime pairs, triplets,
and quadruplets up to 40 million; and the resulting tables have been deposited
in the Unpublished Math. Tables file of Math. tables and other aids to computation.
Leech has carried out similar counts (on EDSAC), including certain quintuplets
and sextuplets, up to 10 million.


XXIII
KRONECKER’S THEOREM


23.1. Kronecker’s theorem in one dimension. Dirichlet’s
Theorem 201 asserts that, given any set of real numbers $\vartheta_1, \vartheta_2, \ldots, \vartheta_k$,
we can make $n\vartheta_1, n\vartheta_2, \ldots, n\vartheta_k$ all differ from integers by as little as we
please. This chapter is occupied by the study of a famous theorem
of Kronecker which has the same general character as this theorem of
Dirichlet but lies considerably deeper. The theorem is stated, in its
general form, in § 23.4, and proved, by three different methods, in
§§ 23.7-9. For the moment we consider only the simplest case, in which
we are concerned with a single $\vartheta$.

Suppose that we are given two numbers $\vartheta$ and $\alpha$. Can we find an
integer $n$ for which
$n\vartheta - \alpha$
is nearly an integer? The problem reduces to the simplest case of
Dirichlet’s problem when $\alpha = 0$.

It is obvious at once that the answer is no longer unrestrictedly
affirmative. If $\vartheta$ is a rational number $a/b$, in its lowest terms, then
$(n\vartheta) = n\vartheta - [n\vartheta]$ has always one of the values

(23.1.1) $0, \ \frac{1}{b}, \ \frac{2}{b}, \ \ldots, \ \frac{b-1}{b}.$

If $0 < \alpha < 1$, and $\alpha$ is not one of (23.1.1), then

$\left| \frac{r}{b} - \alpha \right| \quad (r = 0, 1, \ldots, b)$

has a positive minimum $\mu$, and $n\vartheta - \alpha$ cannot differ from an integer by
less than $\mu$.

Plainly $\mu \le 1/2b$, and $\mu \to 0$ when $b \to \infty$; and this suggests the truth
of the theorem which follows.


THEOREM 438. If $\vartheta$ is irrational, $\alpha$ is arbitrary, and $N$ and $\epsilon$ are positive,
then there are integers $n$ and $p$ such that $n > N$ and

(23.1.2) $|n\vartheta - p - \alpha| < \epsilon.$


We can state the substance of the theorem more picturesquely by
using the language of § 9.10. It asserts that there are $n$ for which $(n\vartheta)$
is as near as we please to any number in (0, 1), or, in other words,

THEOREM 439. If $\vartheta$ is irrational, then the set of points $(n\vartheta)$ is dense in
the interval (0, 1).†


Either of Theorems 438 and 439 may be called ‘Kronecker’s theorem
in one dimension’.
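
The content of Theorem 438 may be made concrete by a direct search. The sketch below simply tries $n = N+1, N+2, \ldots$ until (23.1.2) holds; it is an illustration in floating-point arithmetic, not a proof, and the function name `kronecker_1d` and the cut-off `limit` are ours.

```python
import math

def kronecker_1d(theta, alpha, N, eps, limit=10 ** 6):
    """Search for integers n > N and p with |n*theta - p - alpha| < eps.

    Returns (n, p), or None if no n below `limit` works (which, for
    irrational theta, only means that the cut-off was too small).
    """
    for n in range(N + 1, limit):
        d = n * theta - alpha
        p = round(d)                 # the nearest integer is the best choice of p
        if abs(d - p) < eps:
            return n, p
    return None

if __name__ == "__main__":
    theta = math.sqrt(2)
    for eps in (1e-1, 1e-2, 1e-3, 1e-4):
        print(eps, kronecker_1d(theta, 0.3, 100, eps))
```

If a rational $\vartheta = a/b$ is supplied with an $\alpha$ not of the form (23.1.1), the search fails for small $\epsilon$, in agreement with the remarks preceding the theorem.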


23.2. Proofs of the one-dimensional theorem. Theorems 438
and 439 are easy, but we give several proofs, to illustrate different ideas
important in this field of arithmetic. Some of our arguments are, and
some are not, extensible to space of more dimensions.

(i) By Theorem 201, with $k = 1$, there are integers $n_1$ and $p$ such
that $|n_1\vartheta - p| < \epsilon$. The point $(n_1\vartheta)$ is therefore within a distance $\epsilon$ of
either 0 or 1. The series of points

$(n_1\vartheta), \ (2n_1\vartheta), \ (3n_1\vartheta), \ \ldots,$

continued so long as may be necessary, mark a chain (in one direction
or the other) across the interval (0, 1) whose mesh‡ is less than $\epsilon$.
There is therefore a point $(kn_1\vartheta)$ or $(n\vartheta)$ within a distance $\epsilon$ of any $\alpha$
of (0, 1).

(ii) We can restate (i) so as to avoid an appeal to Theorem 201, and
we do this explicitly because the proof resulting will be the model of
our first proof in space of several dimensions.

We have to prove the set $S$ of points $P_n$ or $(n\vartheta)$, with $n = 1, 2, 3, \ldots$,
dense in (0, 1). Since $\vartheta$ is irrational, no point falls at 0, and no two
points coincide. The set has therefore a limit point, and there are pairs
$(P_n, P_{n+r})$, with $r > 0$, and indeed with arbitrarily large $r$, as near to
one another as we please.

We call the directed stretch $P_n P_{n+r}$ a vector. If we mark off a stretch
$P_m Q$, equal to $P_n P_{n+r}$ and in the same direction, from any $P_m$, then $Q$
is another point of $S$, and in fact $P_{m+r}$. It is to be understood, when we
make this construction, that if the stretch $P_m Q$ would extend beyond
0 or 1, then the part of it so extending is to be replaced by a congruent
part measured from the other end 1 or 0 of the interval (0, 1).

There are vectors of length less than $\epsilon$, and such vectors, with $r > N$,
extending from any point of $S$ and in particular from $P_1$. If we measure
off such a vector repeatedly, starting from $P_1$, we obtain a chain of
points with the same properties as the chain of (i), and can complete
the proof in the same way.


† We may seem to have lost something when we state the theorem thus (viz. the
inequality $n > N$). But it is plain that, if there are points of the set as near as we
please to every $\alpha$ of (0, 1), then among these points there are points for which $n$ is as
large as we please.

‡ The distance between consecutive points of the chain.

(iii) There is another interesting ‘geometrical’ proof which cannot be
extended, easily at any rate, to space of many dimensions.

We represent the real numbers, as in § 3.8, on a circle of unit circumference
instead of on a straight line. This representation automatically
rejects integers; 0 and 1 are represented by the same point of the circle
and so, generally, are $(n\vartheta)$ and $n\vartheta$.

To say that $S$ is dense on the circle is to say that every $\alpha$ belongs to
the derived set $S'$. If $\alpha$ belongs to $S$ but not to $S'$, there is an interval
round $\alpha$ free from points of $S$, except for $\alpha$ itself, and therefore there
are points near $\alpha$ belonging neither to $S$ nor to $S'$. It is therefore sufficient
to prove that every $\alpha$ belongs either to $S$ or to $S'$.

If $\alpha$ belongs neither to $S$ nor to $S'$, there is an interval $(\alpha-\delta, \alpha+\delta')$,
with positive $\delta$ and $\delta'$, which contains no point of $S$ inside it; and among
all such intervals there is a greatest.† We call this maximum interval
$I(\alpha)$ the excluded interval of $\alpha$.

It is plain that, if $\alpha$ is surrounded by an excluded interval $I(\alpha)$, then
$\alpha-\vartheta$ is surrounded by a congruent excluded interval $I(\alpha-\vartheta)$. We thus
define an infinite series of intervals

$I(\alpha), \ I(\alpha-\vartheta), \ I(\alpha-2\vartheta), \ \ldots$

similarly disposed about the points $\alpha$, $\alpha-\vartheta$, $\alpha-2\vartheta$, .... No two of these
intervals can coincide, since $\vartheta$ is irrational; and no two can overlap, since
two overlapping intervals would constitute together a larger interval,
free from points of $S$, about one of the points. This is a contradiction,
since the circumference cannot contain an infinity of non-overlapping
intervals of equal length. The contradiction shows that there can be
no interval $I(\alpha)$, and so proves the theorem.

(iv) Kronecker’s own proof is rather more sophisticated, but proves
a good deal more. It proves


THEOREM 440. If $\vartheta$ is irrational, $\alpha$ is arbitrary, and $N$ positive, then
there is an $n > N$ and a $p$ for which

$|n\vartheta - p - \alpha| < \frac{3}{n}.$

It will be observed that this theorem, unlike Theorem 438, gives a
definite bound for the ‘error’ in terms of $n$, of the same kind (though
not so precise) as those given by Theorems 183 and 193 when $\alpha = 0$.

† We leave the formal proof, which depends upon the construction of ‘Dedekind
sections’ of the possible values of $\delta$ and $\delta'$, and is of a type familiar in elementary
analysis, to the reader.

By Theorem 193 there are coprime integers $q > 2N$ and $r$ such that

(23.2.1) $|q\vartheta - r| < \frac{1}{q}.$

Suppose that $Q$ is the integer, or one of the two integers, such that

(23.2.2) $|q\alpha - Q| \le \tfrac12.$

We can express $Q$ in the form

(23.2.3) $Q = vr - uq,$

where $u$ and $v$ are integers and

(23.2.4) $|v| \le \tfrac12 q.$

Then $q(v\vartheta - u - \alpha) = v(q\vartheta - r) - (q\alpha - Q),$

and therefore

(23.2.5) $|q(v\vartheta - u - \alpha)| < \tfrac12 q \cdot \frac{1}{q} + \tfrac12 = 1,$

by (23.2.1), (23.2.2), and (23.2.4). If now we write

$n = q + v, \quad p = r + u,$

then

(23.2.6) $N < \tfrac12 q \le n \le \tfrac32 q$

and

$|n\vartheta - p - \alpha| \le |v\vartheta - u - \alpha| + |q\vartheta - r| < \frac{1}{q} + \frac{1}{q} = \frac{2}{q} \le \frac{3}{n}$

by (23.2.1), (23.2.5), and (23.2.6).
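
Kronecker’s argument is constructive, and the steps (23.2.1)-(23.2.6) can be followed on a machine. The sketch below does so for the particular irrational $\vartheta = \sqrt{2}$, chosen by us because its convergents $r/q$ (which satisfy (23.2.1)) obey a simple recurrence. The function names are ours, and the final comparison is made in floating-point arithmetic, so it is reliable only while $q$ is of moderate size. The step (23.2.3) is carried out with Python’s modular inverse `pow(r, -1, q)`.

```python
import math

def sqrt2_convergents():
    """Yield the convergents r/q of sqrt(2): 1/1, 3/2, 7/5, 17/12, ..."""
    r_prev, q_prev, r, q = 1, 1, 3, 2
    yield r_prev, q_prev
    while True:
        yield r, q
        r_prev, q_prev, r, q = r, q, 2 * r + r_prev, 2 * q + q_prev

def kronecker_pair(alpha, N):
    """Return (n, p) with n > N and |n*sqrt(2) - p - alpha| < 3/n, for N >= 1."""
    # (23.2.1): coprime r, q with q > 2N and |q*theta - r| < 1/q
    for r, q in sqrt2_convergents():
        if q > 2 * N:
            break
    # (23.2.2): Q is an integer nearest to q*alpha
    Q = round(q * alpha)
    # (23.2.3), (23.2.4): solve Q = v*r - u*q with |v| <= q/2
    v = (Q * pow(r, -1, q)) % q
    if 2 * v > q:
        v -= q
    u = (v * r - Q) // q             # exact, since v*r is congruent to Q (mod q)
    return q + v, r + u

if __name__ == "__main__":
    theta = math.sqrt(2)
    for alpha, N in ((0.3, 10), (0.5, 1), (0.9137, 500)):
        n, p = kronecker_pair(alpha, N)
        print(alpha, N, n, p, abs(n * theta - p - alpha), 3 / n)
```

For example, with $\alpha = 0.3$ and $N = 10$ the construction takes $q = 29$, $r = 41$, $Q = 9$, $v = 8$, $u = 11$ and so $n = 37$, $p = 52$.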

It is possible to refine upon the 3 of the theorem, but not, by this 
method, in a very interesting way. We return to this question in 
Ch. XXIV. 


23.3. The problem of the reflected ray. Before we pass to the
general proof of Kronecker’s theorem, we shall apply the special case
already proved to a simple but entertaining problem of plane geometry
solved by König and Szücs.

The sides of a square are reflecting mirrors. A ray of light leaves a
point inside the square and is reflected repeatedly in the mirrors. What
is the nature of its path?†


THEOREM 441. Either the path is closed and periodic or it is dense
in the square, passing arbitrarily near to every point of the square. A
necessary and sufficient condition for periodicity is that the angle between
a side of the square and the initial direction of the ray should have a rational
tangent.


† It may happen exceptionally that the ray passes through a corner of the square.
In this case we assume that it returns along its former path. This is the convention
suggested by considerations of continuity.


In Fig. 10 the parallels to the axes are the lines

$x = l + \tfrac12, \quad y = m + \tfrac12,$

where $l$ and $m$ are integers. The thick square, of side 1, round the
origin is the square of the problem and $P$, or $(a, b)$, is the starting-point.
We construct all images of $P$ in the mirrors, for direct or repeated
reflection.

[Fig. 10]

A moment’s thought will show that they are of four types,
the coordinates of the images of the different types being

(A) $a+2l, \ b+2m$; (B) $a+2l, \ -b+2m+1$;

(C) $-a+2l+1, \ b+2m$; (D) $-a+2l+1, \ -b+2m+1$;

where $l$ and $m$ are arbitrary integers.† Further, if the velocity at $P$ has
direction cosines $\lambda$, $\mu$, then the corresponding images of the velocity
have direction cosines

(A) $\lambda, \ \mu$; (B) $\lambda, \ -\mu$; (C) $-\lambda, \ \mu$; (D) $-\lambda, \ -\mu$.

We may suppose, on grounds of symmetry, that $\mu$ is positive.
† The $x$-coordinate takes all values derived from $a$ by the repeated use of the
substitutions $x' = 1 - x$ and $x' = -1 - x$. The figure shows the images corresponding to
non-negative $l$ and $m$.

If we think of the plane as divided into squares of unit side, the
interior of a typical square being

(23.3.1) $l - \tfrac12 < x < l + \tfrac12, \quad m - \tfrac12 < y < m + \tfrac12,$

then each square contains just one image of every point in the original
square
$-\tfrac12 < x < \tfrac12, \quad -\tfrac12 < y < \tfrac12;$
and, if the image in (23.3.1) of any point in the original square is of
type A, B, C, or D, then the image in (23.3.1) of any other point in the
original square is of the same type.

We now imagine $P$ moving with the ray. When $P$ meets a mirror
at $Q$, it coincides with an image; and the image of $P$ which momentarily
coincides with $P$ continues the motion of $P$, in its original direction, in
one of the squares adjacent to the fundamental square. We follow
the motion of the image, in this square, until it in its turn meets a side
of the square. It is plain that the original path of $P$ will be continued
indefinitely in the same line $L$, by a series of different images.

The segment of $L$ in any square (23.3.1) is the image of a straight
portion of the path of $P$ in the original square. There is a one-to-one
correspondence between the segments of $L$, in different squares
(23.3.1), and the portions of the path of $P$ between successive reflections,
each segment of $L$ being an image of the corresponding portion
of the path of $P$.

The path of $P$ in the original square will be periodic if $P$ returns
to its original position moving in the same direction; and this will
happen if and only if $L$ passes through an image of type A of the
original $P$. The coordinates of an arbitrary point of $L$ are

$x = a + \lambda t, \quad y = b + \mu t.$

Hence the path will be periodic if and only if

$\lambda t = 2l, \quad \mu t = 2m$

for some $t$ and integral $l$, $m$; i.e. if $\lambda/\mu$ is rational.

It remains to show that, when $\lambda/\mu$ is irrational, the path of $P$
approaches arbitrarily near to every point $(\xi, \eta)$ of the square. It is
necessary and sufficient for this that $L$ should pass arbitrarily near to
some image of $(\xi, \eta)$ and sufficient that it should pass near some image
of $(\xi, \eta)$ of type A, and this will be so if

(23.3.2) $|a + \lambda t - \xi - 2l| < \epsilon, \quad |b + \mu t - \eta - 2m| < \epsilon$

for every $\xi$ and $\eta$, any positive $\epsilon$, some positive $t$, and appropriate
integral $l$ and $m$.

We take

$t = \frac{\eta + 2m - b}{\mu},$

when the second of (23.3.2) is satisfied automatically. The first inequality
then becomes

(23.3.3) $|m\vartheta - \omega - l| < \tfrac12\epsilon,$

where

$\vartheta = \frac{\lambda}{\mu}, \quad \omega = (b - \eta)\frac{\lambda}{2\mu} - \tfrac12(a - \xi).$

Theorem 438 shows that, when $\vartheta$ is irrational, there are $l$ and $m$, large
enough to make $t$ positive, which satisfy (23.3.3).
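
The device of images lends itself to a short simulation. The straight line $L$ is folded back into the square by the substitutions $x' = 1 - x$ and $x' = -1 - x$, which is all that the function `fold` below does. The sketch then checks one periodic path and estimates how near a path with irrational $\lambda/\mu$ comes to a chosen point. All names, the starting-point, the directions, and the sampling step `dt` are our own choices for illustration.

```python
import math

def fold(u):
    """Fold an unfolded coordinate back into [-1/2, 1/2] (mirrors at -1/2 and 1/2)."""
    w = (u + 0.5) % 2.0
    return (w if w <= 1.0 else 2.0 - w) - 0.5

def ray_position(a, b, lam, mu, t):
    """Position at time t of the ray leaving (a, b) with direction cosines lam, mu."""
    return fold(a + lam * t), fold(b + mu * t)

def closest_approach(a, b, lam, mu, target, t_max, dt=0.01):
    """Least sampled distance from the ray to `target` for 0 <= t <= t_max."""
    best = float("inf")
    steps = int(t_max / dt)
    for i in range(steps + 1):
        x, y = ray_position(a, b, lam, mu, i * dt)
        best = min(best, math.hypot(x - target[0], y - target[1]))
    return best

if __name__ == "__main__":
    # lambda/mu = 3/4 is rational: lambda*t = 6 and mu*t = 8 when t = 10
    print(ray_position(0.1, -0.2, 0.6, 0.8, 10.0))
    # lambda/mu = 1/sqrt(2) is irrational: the path approaches any point
    lam, mu = 1 / math.sqrt(3), math.sqrt(2) / math.sqrt(3)
    for t_max in (10.0, 100.0, 1000.0):
        print(t_max, closest_approach(0.1, -0.2, lam, mu, (0.3, 0.4), t_max))
```

In the periodic example the ray is back at its starting-point when $t = 10$, since then $\lambda t = 6$ and $\mu t = 8$ are both even integers.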


23.4. Statement of the general theorem. We pass to the general
problem in space of $k$ dimensions. The numbers $\vartheta_1, \vartheta_2, \ldots, \vartheta_k$ are given,
and we wish to approximate to an arbitrary set of numbers
$\alpha_1, \alpha_2, \ldots, \alpha_k$, integers apart, by equal multiples
of $\vartheta_1, \vartheta_2, \ldots, \vartheta_k$. It is plain, after § 23.1,
that the $\vartheta$ must be irrational, but this condition
is not a sufficient condition for the
possibility of the approximation.

Suppose for example, to fix our ideas, that
$k = 2$, that $\vartheta$, $\phi$, $\alpha$, $\beta$ are positive and less
than 1, and that $\vartheta$ and $\phi$ (whether rational or irrational) satisfy a
relation
$a\vartheta + b\phi + c = 0$
with integral $a$, $b$, $c$. Then
$a \cdot n\vartheta + b \cdot n\phi$
and
$a(n\vartheta) + b(n\phi)$
are integers, and the point whose coordinates are $(n\vartheta)$ and $(n\phi)$ lies on
one or other of a finite number of straight lines. Thus Fig. 11 shows
the case $a = 2$, $b = 3$, when the point lies on one or other of the lines
$2x + 3y = \nu$ ($\nu = 1, 2, 3, 4$). It is plain that, if $(\alpha, \beta)$ does not lie on one
of these lines, it is impossible to approximate to it with more than a
certain accuracy.

[Fig. 11]
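
The case $a = 2$, $b = 3$ can be reproduced numerically. In the sketch below the particular values $\vartheta = \sqrt{2} - 1$ and $\phi = (2 - 2\vartheta)/3$ are our own choice, made so that $2\vartheta + 3\phi - 2 = 0$ with both numbers between 0 and 1; the text does not specify them. The function `line_index` (also our name) returns $2(n\vartheta) + 3(n\phi)$, which should always be one of 1, 2, 3, 4 up to rounding error.

```python
import math

def frac(t):
    """The fractional part (t) = t - [t]."""
    return t - math.floor(t)

THETA = math.sqrt(2) - 1            # hypothetical choice of theta
PHI = (2 - 2 * THETA) / 3           # chosen so that 2*theta + 3*phi - 2 = 0

def line_index(n):
    """Return 2(n*theta) + 3(n*phi), the value nu of the line 2x + 3y = nu."""
    return 2 * frac(n * THETA) + 3 * frac(n * PHI)

if __name__ == "__main__":
    seen = sorted({round(line_index(n)) for n in range(1, 2001)})
    print("values of nu that occur:", seen)
```

However far $n$ runs, the points $((n\vartheta), (n\phi))$ stay on these lines, so that a point $(\alpha, \beta)$ off the lines cannot be approached.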

We shall say that a set of numbers

$\xi_1, \xi_2, \ldots, \xi_r$

is linearly independent if no linear relation

$a_1\xi_1 + a_2\xi_2 + \cdots + a_r\xi_r = 0,$

with integral coefficients, not all zero, holds between them. Thus, if
$p_1, p_2, \ldots, p_r$ are different primes, then

$\log p_1, \ \log p_2, \ \ldots, \ \log p_r$

are linearly independent; for

$a_1\log p_1 + a_2\log p_2 + \cdots + a_r\log p_r = 0$

is

$p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r} = 1,$

which contradicts the fundamental theorem of arithmetic.

We now state Kronecker’s theorem in its general form.

THEOREM 442. If $\vartheta_1, \vartheta_2, \ldots, \vartheta_k, 1$
are linearly independent, $\alpha_1, \alpha_2, \ldots, \alpha_k$ are arbitrary, and $N$ and $\epsilon$ are
positive, then there are integers

$n > N, \ p_1, \ p_2, \ \ldots, \ p_k$

such that

$|n\vartheta_m - p_m - \alpha_m| < \epsilon \quad (m = 1, 2, \ldots, k).$

We can also state the theorem in a form corresponding to Theorem
439, but for this we must extend the definitions of § 9.10 to $k$-dimensional
space.

If the coordinates of a point $P$ of $k$-dimensional space are $x_1, x_2, \ldots, x_k$,
and $\delta$ is positive, then the set of points $x'_1, x'_2, \ldots, x'_k$ for which

$|x'_m - x_m| < \delta \quad (m = 1, 2, \ldots, k)$

is called a neighbourhood of $P$. The phrases limit point, derivative, closed,
dense in itself, and perfect are then defined exactly as in § 9.10. Finally,
if we describe the set defined by

$0 \le x_m \le 1 \quad (m = 1, 2, \ldots, k)$

as the ‘unit cube’, then a set of points $S$ is dense in the unit cube if every
point of the cube is a point of the derived set $S'$.

THEOREM 443. If $\vartheta_1, \vartheta_2, \ldots, \vartheta_k, 1$ are linearly independent, then the set
of points

$(n\vartheta_1), \ (n\vartheta_2), \ \ldots, \ (n\vartheta_k)$

is dense in the unit cube.
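
As in one dimension, the assertion of Theorem 442 can be illustrated by a direct search, here with $k = 2$. The sketch below takes $\vartheta_1 = \sqrt{2}$, $\vartheta_2 = \sqrt{3}$ (for which $\vartheta_1$, $\vartheta_2$, 1 are linearly independent) and looks for an $n$ that serves both coordinates at once. The function name `kronecker_search` and the cut-off `limit` are ours, and the arithmetic is floating-point.

```python
import math

def kronecker_search(thetas, alphas, N, eps, limit=10 ** 6):
    """Search for n > N and integers p_m with |n*theta_m - p_m - alpha_m| < eps.

    Returns (n, [p_1, ..., p_k]), or None if no n below `limit` is found.
    """
    for n in range(N + 1, limit):
        ps = [round(n * th - al) for th, al in zip(thetas, alphas)]
        if all(abs(n * th - p - al) < eps
               for th, p, al in zip(thetas, ps, alphas)):
            return n, ps
    return None

if __name__ == "__main__":
    thetas = (math.sqrt(2), math.sqrt(3))
    alphas = (0.3, 0.7)
    for eps in (0.1, 0.05, 0.01):
        print(eps, kronecker_search(thetas, alphas, 50, eps))
```

One expects the first successful $n$ to grow roughly like $(2\epsilon)^{-k}$, since the $k$ conditions must hold simultaneously.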


23.5. The two forms of the theorem. There is an alternative
form of Kronecker’s theorem in which both hypothesis and conclusion
assert a little less.

THEOREM 444. If $\vartheta_1, \vartheta_2, \ldots, \vartheta_k$ are linearly independent, $\alpha_1, \alpha_2, \ldots, \alpha_k$
are arbitrary, and $T$ and $\epsilon$ are positive, then there is a real number $t$, and
integers $p_1, p_2, \ldots, p_k$, such that

$t > T$

and

$|t\vartheta_m - p_m - \alpha_m| < \epsilon \quad (m = 1, 2, \ldots, k).$

The fundamental hypothesis in Theorem 444 is weaker than in
Theorem 442, since it only concerns linear relations homogeneous in
the $\vartheta$. Thus $\vartheta_1 = \sqrt{2}$, $\vartheta_2 = 1$ satisfy the condition of Theorem 444 but
not that of Theorem 442; and, in Theorem 444, just one of the $\vartheta$ may
be rational. The conclusion is also weaker, because $t$ is not necessarily
integral.

It is easy to prove that the two theorems are equivalent. It is useful
to have both forms, since some proofs lead most naturally to one form
and some to the other.

(1) Theorem 444 implies Theorem 442. We suppose, as we may, that
every $\vartheta$ lies in (0, 1) and that $\epsilon < 1$. We apply Theorem 444, with $k+1$
for $k$, $N+1$ for $T$, and $\tfrac12\epsilon$ for $\epsilon$, to the systems

$\vartheta_1, \vartheta_2, \ldots, \vartheta_k, 1; \quad \alpha_1, \alpha_2, \ldots, \alpha_k, 0.$

The hypothesis of linear independence is then that of Theorem 442; and
the conclusion is expressed by

(23.5.1) $t > N+1,$
(23.5.2) $|t\vartheta_m - p_m - \alpha_m| < \tfrac12\epsilon \quad (m = 1, 2, \ldots, k),$
(23.5.3) $|t - p_{k+1}| < \tfrac12\epsilon.$

From (23.5.1) and (23.5.3) it follows that $p_{k+1} > N$, and from (23.5.2)
and (23.5.3) that

$|p_{k+1}\vartheta_m - p_m - \alpha_m| \le |t\vartheta_m - p_m - \alpha_m| + |t - p_{k+1}| < \epsilon.$

These are the conclusions of Theorem 442, with $n = p_{k+1}$.

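(2) Theorem 442 implies Theorem 444. We now deduce Theorem 444
from Theorem 442. We observe first that Kronecker’s theorem (in
either form) is ‘additive in the $\alpha$’; if the result is true for a set of $\vartheta$
and for $\alpha_1, \ldots, \alpha_k$, and also for the same set of $\vartheta$ and for $\beta_1, \ldots, \beta_k$, then it
is true for the same $\vartheta$ and for $\alpha_1+\beta_1, \ldots, \alpha_k+\beta_k$. For if the differences of
$p\vartheta$ from $\alpha$, and of $q\vartheta$ from $\beta$, are nearly integers, then the difference
of $(p+q)\vartheta$ from $\alpha+\beta$ is nearly an integer.

If $\vartheta_1, \vartheta_2, \ldots, \vartheta_{k+1}$ are linearly independent, then so are

$\frac{\vartheta_1}{\vartheta_{k+1}}, \ \ldots, \ \frac{\vartheta_k}{\vartheta_{k+1}}, \ 1.$

We apply Theorem 442, with $N = T$, to the system

$\frac{\vartheta_1}{\vartheta_{k+1}}, \ \ldots, \ \frac{\vartheta_k}{\vartheta_{k+1}}; \quad \alpha_1, \ldots, \alpha_k.$

There are integers $n > N$, $p_1, \ldots, p_k$ such that

(23.5.4) $\left| \frac{n\vartheta_m}{\vartheta_{k+1}} - p_m - \alpha_m \right| < \epsilon \quad (m = 1, 2, \ldots, k).$

If we take $t = n/\vartheta_{k+1}$, then the inequalities (23.5.4) are $k$ of those
required, and
$|t\vartheta_{k+1} - n| = 0 < \epsilon.$
Also $t \ge n > N = T$. We thus obtain Theorem 444, for

$\vartheta_1, \ldots, \vartheta_k, \vartheta_{k+1}; \quad \alpha_1, \ldots, \alpha_k, 0.$

We can prove it similarly for

$\vartheta_1, \ldots, \vartheta_k, \vartheta_{k+1}; \quad 0, \ldots, 0, \alpha_{k+1},$

and the full theorem then follows from the remark at the beginning of (2).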


23.6. An illustration. Kronecker’s theorem is one of those mathematical 
theorems which assert, roughly, that ‘what is not impossible will happen 
sometimes however improbable it may be’. We can illustrate this ‘astronomically’. 

Suppose that $k$ spherical planets revolve round a point $O$ in concentric 
coplanar circles, their angular velocities being $2\pi\omega_1, 2\pi\omega_2, \ldots, 2\pi\omega_k$, that there is 
an observer at $O$, and that the apparent diameter of the inmost planet $P$, observed 
from $O$, is greater than that of any outer planet. 

If the planets are all in conjunction at time $t = 0$ (so that $P$ occults all the 
other planets), then their angular coordinates at time $t$ are $2\pi t\omega_1, \ldots$. Theorem 201 
shows that we can choose a $t$, as large as we please, for which all these angles are 
as near as we please to integral multiples of $2\pi$. Hence occultation of the whole 
system by $P$ will recur continually. This conclusion holds for all angular 
velocities. 

If the angular coordinates are initially $\alpha_1, \alpha_2, \ldots, \alpha_k$, then such an occultation may 
never occur. For example, two of the planets might be originally in opposition 
and have equal angular velocities. Suppose, however, that the angular velocities 
are linearly independent. Then Theorem 444 shows that, for appropriate $t$, as large 
as we please, all of 

    $2\pi t\omega_1 + \alpha_1, \ \ldots, \ 2\pi t\omega_k + \alpha_k$

will be as near as we please to multiples of $2\pi$; and then occultations will recur 
whatever the initial positions. 
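
The recurrence described in this section can be explored numerically. The following Python sketch is an illustration only and plays no part in the proofs: it searches by brute force for an integer $n$ satisfying the inequalities of Theorem 442. The function name `kronecker_search`, the sample values $\vartheta = (\sqrt{2}, \sqrt{3})$ and $\alpha = (0.3, 0.7)$, and the search limit are all assumptions made for the example, and floating-point rounding limits its reliability for very large $n$.

```python
import math


def kronecker_search(thetas, alphas, eps, n_min=0, n_max=100_000):
    """Return the least integer n with n_min < n <= n_max such that
    n*theta_m - alpha_m is within eps of an integer for every m,
    or None if no such n exists in the searched range."""
    def dist_to_nearest_integer(x):
        return abs(x - round(x))

    for n in range(n_min + 1, n_max + 1):
        if all(dist_to_nearest_integer(n * th - al) < eps
               for th, al in zip(thetas, alphas)):
            return n
    return None


# 1, sqrt(2), sqrt(3) are linearly independent over the rationals,
# so Theorem 442 guarantees solutions beyond any n_min.
thetas = (math.sqrt(2), math.sqrt(3))
alphas = (0.3, 0.7)
n_small = kronecker_search(thetas, alphas, eps=0.05)
n_large = kronecker_search(thetas, alphas, eps=0.05, n_min=1000)
```

Raising `n_min` corresponds to the requirement $n > N$ in the theorem; the theorem itself gives no bound on how far the search must go, which is why `n_max` is only a practical cut-off.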

23.7. Lettenmeyer’s proof of the theorem. We now suppose 
that k = 2, and prove Kronecker’s theorem in this case by a ‘geo- 
metrical’ method due to Lettenmeyer. When k = 1, Lettenmeyer’s 
argument reduces to that used in § 23.2 (ii). 

We take the first form of the theorem, and write $\vartheta$, $\phi$ for $\vartheta_1$, $\vartheta_2$. We 
may suppose 

    $0 < \vartheta < 1, \quad 0 < \phi < 1;$

and we have to show that if $\vartheta$, $\phi$, 1 are linearly independent then the 
points $P_n$ whose coordinates are 

    $(n\vartheta), \ (n\phi) \quad (n = 1, 2, \ldots)$

are dense in the unit square. No two $P_n$ coincide, and no $P_n$ lies on a 
side of the square. 

We call the directed stretch 

    $P_n P_{n+r} \quad (n > 0, \ r > 0)$

a vector. If we take any point $P_m$, and draw a vector $P_m Q$ equal and 
parallel to the vector $P_n P_{n+r}$, then the other end $Q$ of this vector is a 
point of the set (and in fact $P_{m+r}$). Here naturally we adopt the 
convention corresponding to that of § 23.2 (ii), viz. that, if $P_m Q$ meets a 
side of the square, then it is continued in the same direction from the 
corresponding point on the opposite side of the square. 

Since no two points $P_n$ coincide, the set $(P_n)$ has a limit point; there 
are therefore vectors whose length is less than any positive $\epsilon$, and vectors 
of this kind for which $r$ is as large as we please. We call these vectors 
$\epsilon$-vectors. There are $\epsilon$-vectors, and $\epsilon$-vectors with arbitrarily large $r$, 
issuing from every $P_n$, and in particular from $P_1$. If 

    $\epsilon < \min(\vartheta, \phi, 1-\vartheta, 1-\phi),$

then all $\epsilon$-vectors issuing from $P_1$ are unbroken, i.e. do not meet a side 
of the square. 

Two cases are possible a priori. 

(1) There are two $\epsilon$-vectors which are not parallel.† In this case we 
mark them off from $P_1$ and construct the lattice based upon $P_1$ and the 
two other ends of the vectors. Every point of the square is then within 
a distance $\epsilon$ of some lattice point, and the theorem follows. 

(2) All $\epsilon$-vectors are parallel. In this case all $\epsilon$-vectors issuing from 
$P_1$ lie along the same straight line, and there are points $P_r$, $P_s$ on this 
line with arbitrarily large suffixes $r$, $s$. Since $P_1$, $P_r$, $P_s$ are collinear, 

    $0 = \begin{vmatrix} \vartheta & \phi & 1 \\ (r\vartheta) & (r\phi) & 1 \\ (s\vartheta) & (s\phi) & 1 \end{vmatrix} = \begin{vmatrix} \vartheta & \phi & 1 \\ r\vartheta - [r\vartheta] & r\phi - [r\phi] & 1 \\ s\vartheta - [s\vartheta] & s\phi - [s\phi] & 1 \end{vmatrix},$

and so 

    $\begin{vmatrix} \vartheta & \phi & 1 \\ [r\vartheta] & [r\phi] & r-1 \\ [s\vartheta] & [s\phi] & s-1 \end{vmatrix} = 0,$

or 

    $a\vartheta + b\phi + c = 0,$

where $a$, $b$, $c$ are integers. But $\vartheta$, $\phi$, 1 are linearly independent, and 
therefore $a$, $b$, $c$ are all zero. Hence, in particular, 

    $\begin{vmatrix} [r\phi] & r-1 \\ [s\phi] & s-1 \end{vmatrix} = 0,$

or 

    $\dfrac{[s\phi]}{s-1} = \dfrac{[r\phi]}{r-1}.$

† In the sense of elementary geometry, where we do not distinguish two directions 
on one straight line. 

We can make $s \to \infty$, since there are $P_s$ with arbitrarily large $s$; and we 
then obtain 

    $\phi = \lim \dfrac{[s\phi]}{s-1} = \dfrac{[r\phi]}{r-1},$

which is impossible because $\phi$ is irrational. 
It follows that case (2) is impossible, so that the theorem is proved. 


23.8. Estermann’s proof of the theorem. Lettenmeyer’s argu- 
ment may be extended to space of k dimensions, and leads to a general 
proof of Kronecker’s theorem; but the ideas which underlie it are illus- 
trated adequately in the two-dimensional case. In this and the next 
section we prove the general theorem by two other quite different 
methods. 

Estermann’s proof is inductive. His argument shows that the theorem 
is true in space of k dimensions if it is true in space of k- 1. It also 
shows incidentally that the theorem is true in one-dimensional space, 
so that the proof is self-contained; but this we have proved already, 
and the reader may, if he pleases, take it for granted. 

The theorem in its first form states that, if $\vartheta_1, \vartheta_2, \ldots, \vartheta_k$, 1 are linearly 
independent, $\alpha_1, \alpha_2, \ldots, \alpha_k$ are arbitrary, and $\epsilon$ and $\omega$ are positive, then 
there are integers $n, p_1, p_2, \ldots, p_k$ such that 

(23.8.1)    $n > \omega$

and 

(23.8.2)    $|n\vartheta_m - p_m - \alpha_m| < \epsilon \quad (m = 1, 2, \ldots, k).$

Here the emphasis is on large positive values of $n$. It is convenient 
now to modify the enunciation a little, and consider both positive and 
negative values of $n$. We therefore assert a little more, viz. that, given 
a positive $\epsilon$ and $\omega$, and a $\lambda$ of either sign, then we can choose $n$ and the 
$p$ to satisfy (23.8.2) and 

(23.8.3)    $|n| > \omega, \quad \operatorname{sign} n = \operatorname{sign} \lambda,$

the second equation meaning that $n$ has the same sign as $\lambda$. We have 
to show (a) that this is true for $k$ if it is true for $k-1$, and (b) that it is 
true when $k = 1$. 

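There are, by Theorem 201, integers 

    $s > 0, \quad b_1, b_2, \ldots, b_k$

such that 

(23.8.4)    $|s\vartheta_m - b_m| < \tfrac12\epsilon \quad (m = 1, 2, \ldots, k).$

Since $\vartheta_k$ is irrational, $s\vartheta_k - b_k \ne 0$; and the $k$ numbers 

    $\phi_m = \dfrac{s\vartheta_m - b_m}{s\vartheta_k - b_k} \quad (m = 1, 2, \ldots, k)$

(of which the last is 1) are linearly independent, since a linear relation 
between them would involve one between $\vartheta_1, \ldots, \vartheta_k$, 1. 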

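Suppose first that $k > 1$, and assume the truth of the theorem for 
$k-1$. We apply the theorem, with $k-1$ for $k$, to the system 

    $\phi_1, \phi_2, \ldots, \phi_{k-1}$  (for $\vartheta_1, \vartheta_2, \ldots, \vartheta_{k-1}$), 
    $\beta_1 = \alpha_1 - \alpha_k\phi_1, \ \beta_2 = \alpha_2 - \alpha_k\phi_2, \ \ldots, \ \beta_{k-1} = \alpha_{k-1} - \alpha_k\phi_{k-1}$  (for $\alpha_1, \alpha_2, \ldots, \alpha_{k-1}$), 
    $\tfrac12\epsilon$  (for $\epsilon$), $\qquad \lambda(s\vartheta_k - b_k)$  (for $\lambda$), 

(23.8.5)    $\Omega = (\omega+1)\,|s\vartheta_k - b_k| + |\alpha_k|$  (for $\omega$). 

There are integers $c_k, c_1, c_2, \ldots, c_{k-1}$ such that 

(23.8.6)    $|c_k| > \Omega, \quad \operatorname{sign} c_k = \operatorname{sign}\{\lambda(s\vartheta_k - b_k)\},$

and 

(23.8.7)    $|c_k\phi_m - c_m - \beta_m| < \tfrac12\epsilon \quad (m = 1, 2, \ldots, k-1).$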


The inequality (23.8.7), when expressed in terms of the $\vartheta$, is 

(23.8.8)    $\left| \dfrac{c_k + \alpha_k}{s\vartheta_k - b_k}\,(s\vartheta_m - b_m) - c_m - \alpha_m \right| < \tfrac12\epsilon \quad (m = 1, 2, \ldots, k).$

Here we have included the value $k$ of $m$, as we may do because the 
left-hand side of (23.8.8) vanishes when $m = k$. 

We have supposed $k > 1$. When $k = 1$, (23.8.8) is trivial, and we 
have only to choose $c_k$ to satisfy (23.8.6), as plainly we may. 

We now choose an integer $N$ so that 

(23.8.9)    $\left| N - \dfrac{c_k + \alpha_k}{s\vartheta_k - b_k} \right| < 1,$

and take 

    $n = Ns, \quad p_m = Nb_m + c_m.$

Then 

    $|n\vartheta_m - p_m - \alpha_m| = |N(s\vartheta_m - b_m) - c_m - \alpha_m|$
    $\qquad \le \left| \dfrac{c_k + \alpha_k}{s\vartheta_k - b_k}\,(s\vartheta_m - b_m) - c_m - \alpha_m \right| + |s\vartheta_m - b_m|$
    $\qquad < \tfrac12\epsilon + \tfrac12\epsilon = \epsilon \quad (m = 1, 2, \ldots, k),$

by (23.8.4), (23.8.8), and (23.8.9). This is (23.8.2). Next 

(23.8.10)    $\left| \dfrac{c_k + \alpha_k}{s\vartheta_k - b_k} \right| \ge \dfrac{|c_k| - |\alpha_k|}{|s\vartheta_k - b_k|} > \omega + 1,$

by (23.8.5) and (23.8.6); so that $|N| > \omega$ and 

    $|n| = |N|s \ge |N| > \omega.$

Finally, $n$ has the sign of $N$, and so, after (23.8.9) and (23.8.10), the 
sign of 

    $\dfrac{c_k}{s\vartheta_k - b_k}.$

This, by (23.8.6), is the sign of $\lambda$. 
Hence $n$ and the $p$ satisfy all our demands, and the induction from 
$k-1$ to $k$ is established. 


23.9. Bohr’s proof of the theorem. There are also a number of 
‘analytical’ proofs of Kronecker’s theorem, of which perhaps the 
simplest is one due to Bohr. All such proofs depend on the facts that 

    $e(x) = e^{2\pi i x}$

has the period 1 and is equal to 1 if and only if $x$ is an integer. 
We observe first that 

    $\lim_{T\to\infty} \dfrac{1}{T} \int_0^T e^{cit}\,dt = \lim_{T\to\infty} \dfrac{e^{ciT} - 1}{ciT} = 0$

if $c$ is real and not zero, and is 1 if $c = 0$. It follows that, if 

(23.9.1)    $\chi(t) = \sum_{\nu=1}^{r} b_\nu e^{c_\nu i t},$

where no two $c_\nu$ are equal, then 

(23.9.2)    $b_\nu = \lim_{T\to\infty} \dfrac{1}{T} \int_0^T \chi(t)\,e^{-c_\nu i t}\,dt.$

We take the second form of Kronecker’s theorem (Theorem 444), 
and consider the function 

(23.9.3)    $\phi(t) = |F(t)|,$

where 

(23.9.4)    $F(t) = 1 + \sum_{m=1}^{k} e(\vartheta_m t - \alpha_m),$

of the real variable $t$. Obviously 

    $\phi(t) \le k+1.$

If Kronecker’s theorem is true, we can find a large $t$ for which every 
term in the sum is nearly 1 and $\phi(t)$ is nearly $k+1$. Conversely, if $\phi(t)$ 
is nearly $k+1$ for some large $t$, then (since no term can exceed 1 in 
absolute value) every term must be nearly 1 and Kronecker’s theorem 
must be true. We shall therefore have proved Kronecker’s theorem if 
we can prove that 

(23.9.5)    $\overline{\lim}_{t\to\infty} \phi(t) = k+1.$
The proof is based on certain formal relations between $F(t)$ and the 
function 

(23.9.6)    $\psi(x_1, x_2, \ldots, x_k) = 1 + x_1 + x_2 + \ldots + x_k$

of the $k$ variables $x$. If we raise $\psi$ to the $p$th power by the multinomial 
theorem, we obtain 

(23.9.7)    $\psi^p = \sum a_{n_1, n_2, \ldots, n_k}\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}.$

Here the coefficients $a$ are positive; their individual values are irrelevant, 
but their sum is 

(23.9.8)    $\sum a = \psi^p(1, 1, \ldots, 1) = (k+1)^p.$

We also require an upper bound for their number. There are $p+1$ of 
them when $k = 1$; and 

    $(1 + x_1 + \ldots + x_k)^p = (1 + x_1 + \ldots + x_{k-1})^p + \binom{p}{1}(1 + x_1 + \ldots + x_{k-1})^{p-1} x_k + \ldots + x_k^p,$

so that the number is multiplied at most by $p+1$ when we pass from 
$k-1$ to $k$. Hence the number of the $a$ does not exceed $(p+1)^k$.† 
We now form the corresponding power 

    $F^p = \{1 + e(\vartheta_1 t - \alpha_1) + \ldots + e(\vartheta_k t - \alpha_k)\}^p$

of $F$. This is a sum of the form (23.9.1), obtained by replacing $x_r$ in 
(23.9.7) by $e(\vartheta_r t - \alpha_r)$. When we do this, every product $x_1^{n_1} \cdots x_k^{n_k}$ in (23.9.7) 
will give rise to a different $c_\nu$, since the equality of two $c_\nu$ would imply 
a linear relation between the $\vartheta$.‡ It follows that every coefficient 
$b_\nu$ has an absolute value equal to the corresponding coefficient $a$, and 
that 

    $\sum |b_\nu| = \sum a = (k+1)^p.$

Suppose now that, in contradiction to (23.9.5), 

(23.9.9)    $\overline{\lim}_{t\to\infty} \phi(t) < k+1.$

Then there is a $\lambda$ and a $t_0$ such that, for $t \ge t_0$, 

    $|F(t)| \le \lambda < k+1,$

and 

    $\overline{\lim}_{T\to\infty} \dfrac{1}{T} \int_0^T |F(t)|^p\,dt \le \overline{\lim}_{T\to\infty} \dfrac{1}{T} \int_0^T \lambda^p\,dt = \lambda^p.$

† The actual number is $\binom{p+k}{k}$. 

‡ It is here only that we use the linear independence of the $\vartheta$, and this is naturally 
the kernel of the proof. 

Hence 

    $|b_\nu| = \left| \lim_{T\to\infty} \dfrac{1}{T} \int_0^T \{F(t)\}^p e^{-c_\nu i t}\,dt \right| \le \overline{\lim}_{T\to\infty} \dfrac{1}{T} \int_0^T |F(t)|^p\,dt \le \lambda^p$

and therefore $a \le \lambda^p$ for every $a$. Hence, since there are at most 
$(p+1)^k$ of the $a$, we deduce 

    $(k+1)^p = \sum a \le (p+1)^k \lambda^p,$

(23.9.10)    $\left( \dfrac{k+1}{\lambda} \right)^p \le (p+1)^k.$

But $\lambda < k+1$, and so 

    $\dfrac{k+1}{\lambda} = e^{\delta},$

where $\delta > 0$. Thus 

    $e^{\delta p} \le (p+1)^k,$

which is impossible for large $p$ because 

    $e^{-\delta p}(p+1)^k \to 0$

when $p \to \infty$. Hence (23.9.9) involves a contradiction for large $p$, and 
this proves the theorem. 
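
The two counting facts used in Bohr’s argument, that the coefficients of $(1 + x_1 + \ldots + x_k)^p$ sum to $(k+1)^p$ and that there are at most $(p+1)^k$ of them, can be checked directly for small values. The sketch below is only a numerical check under assumed sample values $k = 3$, $p = 5$; the function name `multinomial_coefficients` is introduced here for illustration.

```python
from itertools import product
from math import comb, factorial


def multinomial_coefficients(k, p):
    """Coefficients a_{n_1,...,n_k} of (1 + x_1 + ... + x_k)^p,
    keyed by the exponent tuple (n_1, ..., n_k)."""
    coeffs = {}
    for exps in product(range(p + 1), repeat=k):
        rest = p - sum(exps)          # exponent carried by the constant term 1
        if rest < 0:
            continue
        c = factorial(p) // factorial(rest)
        for e in exps:
            c //= factorial(e)        # stays an exact integer at every step
        coeffs[exps] = c
    return coeffs


k, p = 3, 5
a = multinomial_coefficients(k, p)
number_of_terms = len(a)              # the exact count is comb(p + k, k)
sum_of_terms = sum(a.values())        # the sum is (k + 1) ** p
```

The gap between the exponential growth of `sum_of_terms` in $p$ and the polynomial growth of `number_of_terms` is what produces the contradiction in (23.9.10).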


23.10. Uniform distribution. Kronecker’s theorem, important as 
it is, does not tell the full truth about the sets of points $(n\vartheta)$ or $(n\vartheta_1)$, 
$(n\vartheta_2), \ldots$ with which it is concerned. These sets are not merely dense in 
the unit interval, or cube, but ‘uniformly distributed’. 

Returning for the moment to one dimension, we say that a set of 
points $P_n$ in $(0,1)$ is uniformly distributed if, roughly, every sub-interval 
of $(0,1)$ contains its proper quota of points. To put the definition 
precisely, we suppose that $I$ is a sub-interval of $(0,1)$, and use $I$ both for 
the interval and for its length. If $n_I$ is the number of the points $P_1$, 
$P_2, \ldots, P_n$ which fall in $I$, and 

(23.10.1)    $\dfrac{n_I}{n} \to I,$

whatever $I$, when $n \to \infty$, then the set is uniformly distributed. We 
can also write (23.10.1) in either of the forms 

(23.10.2)    $n_I \sim nI, \quad n_I = nI + o(n).$

THEOREM 445. If $\vartheta$ is irrational then the points $(n\vartheta)$ are uniformly 
distributed in $(0,1)$. 


We give a proof depending upon the simplest properties of continued 
fractions. We use the circular representation of § 23.2 (iii). 
We choose a positive integer $M$ so that 

(23.10.3)    $\eta = \dfrac{1}{M} < \tfrac12\epsilon,$

and suppose that 

(23.10.4)    $q_\nu \le \eta n < q_{\nu+1},$

where the $q_\nu$ are the denominators of the convergents to $\vartheta$. When $\eta$ is 
fixed, and $n \to \infty$, then $\nu \to \infty$ and $q_\nu \to \infty$, and 

(23.10.5)    $\dfrac{M}{q_\nu} < \tfrac16\epsilon$

for sufficiently large $n$. We write $n$ in the form 

(23.10.6)    $n = rq_\nu + s,$

where $r$ is a positive integer and 

(23.10.7)    $0 \le s < q_\nu.$

Then 

    $\dfrac{1}{\eta} \le \dfrac{n}{q_\nu} = r + \dfrac{s}{q_\nu} < r + 1,$

and so 

(23.10.8)    $M = \dfrac{1}{\eta} \le r \le \dfrac{n}{q_\nu}.$

We suppose that $I$ is $(\alpha, \beta)$, and define $u$ and $v$ as the integers such 
that 

(23.10.9)    $\dfrac{u-1}{q_\nu} \le \alpha < \dfrac{u}{q_\nu}, \quad \dfrac{v}{q_\nu} < \beta \le \dfrac{v+1}{q_\nu};$

$v-u$ will be large when $n$ and $\nu$ are large. The points 

    $\dfrac{w}{q_\nu} \quad (u+M \le w \le v-M)$

lie in the interval 

    $\dfrac{u+M}{q_\nu} \le x \le \dfrac{v-M}{q_\nu},$

which we call $I'$. If a point $P'$ lies in $I'$, and the distance $PP'$ is less 
than $M/q_\nu$, then $P$ lies in $I$. 
We now consider the points $m\vartheta_\nu$, or 

(23.10.10)    $m\,\dfrac{p_\nu}{q_\nu},$

$\vartheta_\nu$ being the $\nu$th convergent to $\vartheta$. The first $q_\nu$ of these points are the 
points 

    $0, \ \dfrac{1}{q_\nu}, \ \dfrac{2}{q_\nu}, \ \ldots, \ \dfrac{q_\nu - 1}{q_\nu}$

in another order. Of these points, $v-u-2M+1$ lie in $I'$; and therefore, 
since $n \ge rq_\nu$, at least 

(23.10.11)    $r(v-u-2M+1)$

of the first $n$ points (23.10.10) lie in $I'$. 

Now 

    $\dfrac{v-u+2}{q_\nu} \ge I,$

by (23.10.9), or $v-u \ge q_\nu I - 2$. 
Hence 

    $r(v-u-2M+1) \ge r(q_\nu I - 2M - 1) \ge r(q_\nu I - 3M) = nI - sI - 3Mr.$

But 

    $sI \le s < q_\nu \le \eta n < \tfrac12\epsilon n,$

by (23.10.7), (23.10.4), and (23.10.3); and 

    $3Mr \le \dfrac{3Mn}{q_\nu} < \tfrac12\epsilon n,$

by (23.10.8) and (23.10.5). It follows that the number of $m\vartheta_\nu$ in $I'$ for 
which $m \le n$ is greater than $n(I - \epsilon)$. 

If $m\vartheta_\nu$ is one of these points, then 

    $|m\vartheta - m\vartheta_\nu| < \dfrac{n}{q_\nu q_{\nu+1}} < \dfrac{1}{\eta q_\nu} = \dfrac{M}{q_\nu},$

by Theorem 171, (23.10.4), and (23.10.3). Since $m\vartheta_\nu$ lies in the interval 
$I'$, $m\vartheta$ lies in the interval $I$. Hence the number of $m\vartheta$ in $I$ for which 
$m \le n$ is greater than $n(I - \epsilon)$; and therefore 

    $\underline{\lim}_{n\to\infty} \dfrac{n_I}{n} \ge I - \epsilon.$

But $\epsilon$ is arbitrary, and therefore 

(23.10.12)    $\underline{\lim}_{n\to\infty} \dfrac{n_I}{n} \ge I.$

Suppose finally that $J$ is the complement of $I$, a single interval in 
the circular representation. Then the same argument shows that 

    $\underline{\lim}_{n\to\infty} \dfrac{n_J}{n} \ge J = 1 - I,$

and therefore that 

(23.10.13)    $\overline{\lim}_{n\to\infty} \dfrac{n_I}{n} \le I;$

and (23.10.12) and (23.10.13) together contain the theorem. 
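
The statement of Theorem 445 lends itself to a simple numerical experiment. The following sketch, which is an illustration under assumed sample values and not a substitute for the proof, counts the points $(m\vartheta)$ falling in an interval and compares the proportion with the length of the interval. The function name `fraction_in_interval`, the choice $\vartheta = \sqrt{2}$ and the interval $(0.2, 0.5)$ are assumptions made for the example.

```python
import math


def fraction_in_interval(theta, n, a, b):
    """Proportion n_I / n of the points (m*theta), m = 1..n,
    whose fractional part lies strictly between a and b."""
    hits = sum(1 for m in range(1, n + 1) if a < (m * theta) % 1.0 < b)
    return hits / n


theta = math.sqrt(2)
a, b = 0.2, 0.5                      # interval I of length 0.3
ratios = {n: fraction_in_interval(theta, n, a, b) for n in (100, 1_000, 20_000)}

# For contrast, a rational theta visits only finitely many points.
rational_ratio = fraction_in_interval(0.5, 1_000, a, b)
```

With $\vartheta = \tfrac12$ the points are alternately $\tfrac12$ and 0, so none falls inside $(0.2, 0.5)$ and the proportion cannot approach the length of the interval.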
The definition of uniform distribution may be extended at once to 
space of $k$ dimensions, and Kronecker’s general theorem may be 
sharpened in the same way. But the proof is more difficult, and the 
argument which we have used in this section cannot be generalized. 

It is natural to inquire what happens in the exceptional cases when 
the $\vartheta$ are connected by one or more linear relations. Suppose, to fix our 
ideas, that $k = 3$. If there is one relation, the points $P_n$ are limited to 
certain planes, as they were limited to certain lines in § 23.4; if there 
are two, they are limited to lines. Analogy suggests that the 
distribution on these planes or lines should be dense, and indeed uniform; and 
it can be proved that this is so, and that the corresponding theorems 
in space of $k$ dimensions are also true. 


NOTES ON CHAPTER XXIII 


§ 23.1. Kronecker first stated and proved his theorem in the Berliner 
Sitzungsberichte, 1884 [Werke, iii (i), 47-110]. Koksma’s book contains an exhaustive 
bibliography of later work inspired by the theorem. The one-dimensional theorem 
seems to be due to Tchebychef: see Koksma, 76. 

§ 23.2. For proof (iii) see Hardy and Littlewood, Acta Math. 37 (1914), 155-91, 
especially 161-2. 

§ 23.3. König and Szücs, Rendiconti del circolo matematico di Palermo, 36 (1913), 
79-90. 

§ 23.7. Lettenmeyer, Proc. London Math. Soc. (2), 21 (1923), 306-14. 

§ 23.8. Estermann, Journal London Math. Soc. 8 (1933), 18-20. 

§ 23.9. H. Bohr, Journal London Math. Soc. 9 (1934), 5-6; for a variation see 
Proc. London Math. Soc. (2) 21 (1923), 315-16. There is another simple proof 
by Bohr and Jessen in Journal London Math. Soc. 7 (1932), 274-5. 

§ 23.10. Theorem 445 seems to have been found independently, at about the 
same time, by Bohl, Sierpiński, and Weyl. See Koksma, 92. 

The best proof of the theorem is no doubt that given by Weyl in a very 
important paper in Math. Annalen, 77 (1916), 313-52. Weyl proves that a necessary 
and sufficient condition for the uniform distribution of the numbers 

    $(f(1)), \ (f(2)), \ (f(3)), \ \ldots$

in $(0,1)$ is that 

    $\sum_{\nu=1}^{n} e\{hf(\nu)\} = o(n)$

for every integral $h$ other than zero. This principle has many important applications, 
particularly to the problems mentioned at the end of the chapter. 
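
Weyl’s condition is easy to examine numerically. The sketch below is an illustration only: it evaluates the normalized sums $n^{-1}\bigl|\sum e\{hf(\nu)\}\bigr|$ for $f(\nu) = \nu\sqrt{2}$ and, for contrast, for $f(\nu) = \tfrac12\nu$. The function name `weyl_mean` and the sample values of $h$ and $n$ are assumptions chosen for the example.

```python
import cmath
import math


def weyl_mean(f, h, n):
    """Return |(1/n) * sum_{v=1}^{n} e(h*f(v))| with e(x) = exp(2*pi*i*x)."""
    total = sum(cmath.exp(2j * math.pi * h * f(v)) for v in range(1, n + 1))
    return abs(total) / n


def irrational_multiples(v):
    return v * math.sqrt(2)          # f(v) = v*sqrt(2)


def half_multiples(v):
    return v / 2                     # f(v) = v/2, not uniformly distributed


means = [weyl_mean(irrational_multiples, h, 10_000) for h in (1, 2, 3)]
degenerate = weyl_mean(half_multiples, 2, 1_000)
```

For $f(\nu) = \tfrac12\nu$ and $h = 2$ every term of the sum is 1, so the normalized sum stays at 1 instead of tending to 0.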


XXIV 
GEOMETRY OF NUMBERS 


24.1. Introduction and restatement of the fundamental theo- 
rem. This chapter is an introduction to the ‘geometry of numbers’, 
the subject created by Minkowski on the basis of his fundamental 
Theorem 37 and its generalization in space of n dimensions. 

We shall need the n-dimensional generalizations of the notions which 
we used in §§3.9-11; but these, as we said in § 3.11, are straightforward. 
We define a lattice, and equivalence of lattices, as in § 3.5, parallelo- 
grams being replaced by n-dimensional parallelepipeds; and a convex 
region as in the first definition of § 3.9.† Minkowski’s theorem is then 

THEOREM 446. Any convex region in n-dimensional space, symmetrical 
about the origin and of volume greater than $2^n$, contains a point with 
integral coordinates, not all zero. 

Any of the proofs of Theorem 37 in Ch. III may be adapted to prove 
Theorem 446: we take, for example, Mordell’s. The planes 

    $x_r = 2p_r/t \quad (r = 1, 2, \ldots, n)$

divide space into cubes of volume $(2/t)^n$. If $N(t)$ is the number of 
corners of these cubes in the region $R$ under consideration, and $V$ the 
volume of $R$, then 

    $(2/t)^n N(t) \to V$

when $t \to \infty$; and $N(t) > t^n$ if $V > 2^n$ and $t$ is sufficiently large. The 
proof may then be completed as before. 
If $\xi_1, \xi_2, \ldots, \xi_n$ are linear forms in $x_1, x_2, \ldots, x_n$, say 

(24.1.1)    $\xi_r = \alpha_{r,1}x_1 + \alpha_{r,2}x_2 + \ldots + \alpha_{r,n}x_n \quad (r = 1, 2, \ldots, n),$

with real coefficients and determinant 

(24.1.2)    $\Delta = \begin{vmatrix} \alpha_{1,1} & \alpha_{1,2} & \cdots & \alpha_{1,n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n,1} & \alpha_{n,2} & \cdots & \alpha_{n,n} \end{vmatrix} \ne 0,$

then the points in $\xi$-space corresponding to integral $x_1, x_2, \ldots, x_n$ form 
a lattice $\Lambda$‡: we call $\Delta$ the determinant of the lattice. 

† The second definition can also be adapted to n dimensions, the line $l$ becoming an 
(n-1)-dimensional ‘plane’ (whereas the line of the first definition remains a ‘line’). We 
shall use three-dimensional language: thus we shall call the region $|x_1| < 1$, $|x_2| < 1, \ldots,$ 
$|x_n| < 1$ the ‘unit cube’. 

‡ In § 3.5 we used L for a lattice of lines, $\Lambda$ for the corresponding point-lattice. It 
is more convenient now to reserve Greek letters for configurations in ‘$\xi$-space’. 

A region $R$ of $x$-space is transformed into a region $P$ of $\xi$-space, and a convex $R$ into 
a convex $P$.† Also 

    $\int\!\!\int \cdots \int d\xi_1\, d\xi_2 \ldots d\xi_n = |\Delta| \int\!\!\int \cdots \int dx_1\, dx_2 \ldots dx_n,$

so that the volume of $P$ is $|\Delta|$ times that of $R$. We can therefore restate 
Theorem 446 in the form 

THEOREM 447. If $\Lambda$ is a lattice of determinant $\Delta$, and $P$ is a convex 
region symmetrical about $O$ and of volume greater than $2^n|\Delta|$, then $P$ 
contains a point of $\Lambda$ other than $O$. 

We assume throughout the chapter that $\Delta \ne 0$. 


24.2. Simple applications. The theorems which follow will all 
have the same character. We shall be given a system of forms $\xi_r$, 
usually linear and homogeneous, but sometimes (as in Theorem 455) 
non-homogeneous, and we shall prove that there are integral values of 
the $x_r$ (usually not all 0) for which the $\xi_r$ satisfy certain inequalities. 
We can obtain such theorems at once by applying Theorem 447 to 
various simple regions $P$. 

(1) Suppose first that $P$ is the region defined by 

    $|\xi_1| < \lambda_1, \quad |\xi_2| < \lambda_2, \quad \ldots, \quad |\xi_n| < \lambda_n.$

This is convex and symmetrical about $O$, and its volume is $2^n\lambda_1\lambda_2\ldots\lambda_n$. 
If $\lambda_1\lambda_2\ldots\lambda_n > |\Delta|$, $P$ contains a lattice point other than $O$; if 
$\lambda_1\lambda_2\ldots\lambda_n \ge |\Delta|$, there is a lattice point, other than $O$, inside $P$ or on 
its boundary.‡ We thus obtain 

THEOREM 448. If $\xi_1, \xi_2, \ldots, \xi_n$ are homogeneous linear forms in $x_1$, 
$x_2, \ldots, x_n$, with real coefficients and determinant $\Delta$, $\lambda_1, \lambda_2, \ldots, \lambda_n$ are positive, 
and 

(24.2.1)    $\lambda_1\lambda_2\ldots\lambda_n \ge |\Delta|,$

then there are integers $x_1, x_2, \ldots, x_n$, not all 0, for which 

(24.2.2)    $|\xi_1| \le \lambda_1, \quad |\xi_2| \le \lambda_2, \quad \ldots, \quad |\xi_n| \le \lambda_n.$

In particular we can make $|\xi_r| \le |\Delta|^{1/n}$ for each $r$. 
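
Theorem 448 asserts existence only, but for small cases a lattice point of the required kind can be found by direct search. The sketch below is a brute-force illustration, not a method taken from the text: the function name `small_lattice_point`, the search radius `bound` and the sample forms $\xi_1 = x - \sqrt{2}\,y$, $\xi_2 = x + \sqrt{2}\,y$ are assumptions, and the theorem gives no guarantee that a solution lies inside the searched box.

```python
import math
from itertools import product


def small_lattice_point(coeffs, lambdas, bound=20):
    """Search integer vectors x != 0 with |x_i| <= bound for one satisfying
    |xi_r(x)| <= lambda_r for every row r of the coefficient matrix."""
    n = len(coeffs)
    for x in product(range(-bound, bound + 1), repeat=n):
        if not any(x):
            continue                      # skip the origin
        values = [sum(c * xi for c, xi in zip(row, x)) for row in coeffs]
        if all(abs(v) <= lam for v, lam in zip(values, lambdas)):
            return x
    return None


# xi_1 = x - sqrt(2) y, xi_2 = x + sqrt(2) y, with determinant 2*sqrt(2).
root2 = math.sqrt(2)
coeffs = [[1.0, -root2], [1.0, root2]]
lambdas = (0.2, 14.2)                     # 0.2 * 14.2 = 2.84 >= 2*sqrt(2)
point = small_lattice_point(coeffs, lambdas)
```

Making $\lambda_1$ small forces $x/y$ to be a good rational approximation to $\sqrt{2}$, which is how theorems of this kind connect with Diophantine approximation.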


† The invariance of convexity depends on two properties of linear transformations, 
viz. (1) that lines and planes are transformed into lines and planes, and (2) that the 
order of points on a line is unaltered. 

‡ We pass here by an appeal to continuity from a result concerning an open region 
to one concerning the corresponding closed region. We might, of course, make a similar 
change in the general theorems 446 and 447: thus any closed convex region, symmetrical 
about $O$, and of volume not less than $2^n$, has a lattice point, other than $O$, inside it or 
on its boundary. We shall not again refer explicitly to such trivial appeals to continuity. 

(2) Secondly, suppose that $P$ is defined by 

(24.2.3)    $|\xi_1| + |\xi_2| + \ldots + |\xi_n| < \lambda.$

If $n = 2$, $P$ is a square; if $n = 3$, an octahedron. In the general case 
it consists of $2^n$ congruent parts, one in each ‘octant’. It is obviously 
symmetrical about $O$, and it is convex because 

    $|\mu\xi + \mu'\xi'| \le \mu|\xi| + \mu'|\xi'|$

for positive $\mu$ and $\mu'$. The volume in the positive octant $\xi_r > 0$ is 

    $\lambda^n \int_0^1 d\xi_1 \int_0^{1-\xi_1} d\xi_2 \cdots \int_0^{1-\xi_1-\ldots-\xi_{n-1}} d\xi_n = \dfrac{\lambda^n}{n!}.$

If $\lambda^n > n!\,|\Delta|$ then the volume of $P$ exceeds $2^n|\Delta|$, and there is a lattice 
point, besides $O$, in $P$. Hence we obtain 

THEOREM 449. There are integers $x_1, x_2, \ldots, x_n$, not all 0, for which 

(24.2.4)    $|\xi_1| + |\xi_2| + \ldots + |\xi_n| \le (n!\,|\Delta|)^{1/n}.$

Since, by the theorem of the arithmetic and geometric means, 

    $n\,|\xi_1\xi_2\ldots\xi_n|^{1/n} \le |\xi_1| + |\xi_2| + \ldots + |\xi_n|,$

we have also 

THEOREM 450. There are integers $x_1, x_2, \ldots, x_n$, not all 0, for which 

(24.2.5)    $|\xi_1\xi_2\ldots\xi_n| \le n^{-n}\,n!\,|\Delta|.$

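(3) As a third application, we define $P$ by 

    $\xi_1^2 + \xi_2^2 + \ldots + \xi_n^2 < \lambda^2$: 

this region is convex because 

    $(\mu\xi + \mu'\xi')^2 \le (\mu + \mu')(\mu\xi^2 + \mu'\xi'^2)$

for positive $\mu$ and $\mu'$. The volume of $P$ is $\lambda^n J_n$, where† 

    $J_n = \int\!\!\int \cdots \int_{\xi_1^2 + \xi_2^2 + \ldots + \xi_n^2 \le 1} d\xi_1\, d\xi_2 \ldots d\xi_n = \dfrac{\pi^{n/2}}{\Gamma(\tfrac12 n + 1)}.$

Hence we obtain 

THEOREM 451. There are integers $x_1, x_2, \ldots, x_n$, not all 0, for which 

(24.2.6)    $\xi_1^2 + \xi_2^2 + \ldots + \xi_n^2 \le 4\left( \dfrac{|\Delta|}{J_n} \right)^{2/n}.$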
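
The constant in Theorem 451 is easily evaluated. The following sketch computes $J_n = \pi^{n/2}/\Gamma(\tfrac12 n + 1)$ and the right-hand side of (24.2.6); the function names `unit_ball_volume` and `theorem_451_bound` are introduced here for illustration, and the determinant value 1 is an assumed sample input.

```python
import math


def unit_ball_volume(n):
    """J_n = pi^(n/2) / Gamma(n/2 + 1), the volume of the unit ball in n dimensions."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def theorem_451_bound(n, det):
    """Right-hand side of (24.2.6): 4 * (|det| / J_n)^(2/n)."""
    return 4 * (abs(det) / unit_ball_volume(n)) ** (2 / n)


bounds = {n: theorem_451_bound(n, 1.0) for n in (2, 3, 4)}
```

For $n = 2$ and $\Delta = 1$ the bound is $4/\pi$, the familiar case of a circle.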
† See, for example, Whittaker and Watson, Modern analysis, ed. 3 (1920), 268. 
For $n = 2$ and $n = 3$ we get the values $\pi\lambda^2$ and $\tfrac43\pi\lambda^3$ for the volumes of a circle or 
a sphere. 

Theorem 451 may be expressed in a different way. A quadratic form 
$Q$ in $x_1, x_2, \ldots, x_n$ is a function 

    $Q(x_1, x_2, \ldots, x_n) = \sum_{r=1}^{n} \sum_{s=1}^{n} a_{r,s}\, x_r x_s$

with $a_{s,r} = a_{r,s}$. The determinant $D$ of $Q$ is the determinant of its 
coefficients. If $Q > 0$ for all $x_1, x_2, \ldots, x_n$, not all 0, then $Q$ is said to 
be positive definite. It is familiar† that $Q$ can then be expressed in the 
form 

    $Q = \xi_1^2 + \xi_2^2 + \ldots + \xi_n^2,$

where $\xi_1, \xi_2, \ldots, \xi_n$ are linear forms with real coefficients and determinant 
$\sqrt{D}$. Hence Theorem 451 may be restated as 

THEOREM 452. If $Q$ is a positive definite quadratic form in $x_1, x_2, \ldots, x_n$, 
with determinant $D$, then there are integral values of $x_1, x_2, \ldots, x_n$, not all 
0, for which 

(24.2.7)    $Q \le 4D^{1/n} J_n^{-2/n}.$


24.3. Arithmetical proof of Theorem 448. There are various 
proofs of Theorem 448 which do not depend on Theorem 446, and the 
great importance of the theorem makes it desirable to give one here. 
We confine ourselves for simplicity to the case n = 2. Thus we are 
given linear forms 


(24.3.1)    $\xi = \alpha x + \beta y, \quad \eta = \gamma x + \delta y,$

with real coefficients and determinant $\Delta = \alpha\delta - \beta\gamma \ne 0$, and positive 
numbers $\lambda$, $\mu$ for which $\lambda\mu \ge |\Delta|$; and we have to prove that 

(24.3.2)    $|\xi| \le \lambda, \quad |\eta| \le \mu,$

for some integral $x$ and $y$ not both 0. We may plainly suppose $\Delta > 0$. 

We prove the theorem in three stages: (1) when the coefficients are 
integral and each of the pairs $\alpha$, $\beta$ and $\gamma$, $\delta$ is coprime; (2) when the 
coefficients are rational; and (3) in the general case. 

(1) We suppose first that $\alpha$, $\beta$, $\gamma$, and $\delta$ are integers and that 

    $(\alpha, \beta) = (\gamma, \delta) = 1.$

Since $(\alpha, \beta) = 1$, there are integers $p$ and $q$ for which $\alpha q - \beta p = 1$. The 
linear transformation 

    $\alpha x + \beta y = X, \quad px + qy = Y$

establishes a (1,1) correlation between integral pairs $x$, $y$ and $X$, $Y$; and 

    $\xi = X, \quad \eta = rX + \Delta Y,$

where $r = \gamma q - \delta p$ is an integer. It is sufficient to prove that $|\xi| \le \lambda$ 
and $|\eta| \le \mu$ for some integral $X$ and $Y$ not both 0. 
If $\lambda < 1$ then $\mu > \Delta$, and $X = 0$, $Y = 1$ gives $\xi = 0$, $|\eta| = \Delta \le \mu$. 

† See, for example, Bôcher, Introduction to higher algebra, ch. 10, or Ferrar, Algebra, 
ch. 11. 

If $\lambda \ge 1$, we take 

    $n = [\lambda], \quad \xi = -\dfrac{r}{\Delta}, \quad h = Y, \quad k = X$

in Theorem 36.† Then 

    $0 < X \le [\lambda] \le \lambda$

and 

    $|rX + \Delta Y| = \Delta X \left| -\dfrac{r}{\Delta} - \dfrac{Y}{X} \right| \le \dfrac{\Delta}{n+1} < \dfrac{\Delta}{\lambda} \le \mu,$

so that $X = k$ and $Y = h$ satisfy our requirements. 
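
Stage (1) is constructive, and its steps can be sketched in code. The following Python illustration is not taken from the text: it assumes integer coefficients with $(\alpha, \beta) = 1$ and $\Delta > 0$, finds $p$, $q$ by the extended Euclidean algorithm, and replaces the appeal to Theorem 36 by a direct search over $X = 1, \ldots, [\lambda]$ with $Y$ the integer nearest to $-rX/\Delta$. The names `ext_gcd` and `stage_one` and the sample coefficients are assumptions made for the example.

```python
from fractions import Fraction
from math import floor


def ext_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b) >= 0."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)


def stage_one(alpha, beta, gamma, delta, lam, mu):
    """Find integers (x, y) != (0, 0) with |alpha x + beta y| <= lam and
    |gamma x + delta y| <= mu, for integer coefficients with gcd(alpha, beta) = 1,
    determinant det > 0 and lam * mu >= det."""
    det = alpha * delta - beta * gamma
    if det <= 0 or lam * mu < det:
        raise ValueError("need det > 0 and lam * mu >= det")
    g, u, v = ext_gcd(alpha, beta)
    if g != 1:
        raise ValueError("alpha and beta must be coprime")
    q, p = u, -v                          # alpha*q - beta*p == 1
    r = gamma * q - delta * p             # eta = r*X + det*Y
    if lam < 1:
        X, Y = 0, 1
    else:
        for X in range(1, floor(lam) + 1):
            Y = (-2 * r * X + det) // (2 * det)   # nearest integer to -r*X/det
            if abs(r * X + det * Y) <= mu:
                break
        else:
            raise RuntimeError("no X found; this would contradict Theorem 36")
    # Invert X = alpha x + beta y, Y = p x + q y.
    return (q * X - beta * Y, -p * X + alpha * Y)


x, y = stage_one(3, 5, 2, 7, 4, Fraction(11, 4))
x_small, y_small = stage_one(3, 5, 2, 7, Fraction(1, 2), 22)
```

Exact integer and `Fraction` arithmetic is used so that the boundary case $\lambda\mu = \Delta$ is not disturbed by rounding.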


(2) We suppose next that $\alpha$, $\beta$, $\gamma$, and $\delta$ are any rational numbers. 
Then we can choose $\rho$ and $\sigma$ so that 

    $\xi' = \rho\xi = \alpha'x + \beta'y, \quad \eta' = \sigma\eta = \gamma'x + \delta'y,$

where $\alpha'$, $\beta'$, $\gamma'$, and $\delta'$ are integers, $(\alpha', \beta') = 1$, $(\gamma', \delta') = 1$, and 
$\Delta' = \alpha'\delta' - \beta'\gamma' = \rho\sigma\Delta$. Also $\rho\lambda \cdot \sigma\mu \ge \Delta'$, and therefore, after (1), there 
are integers $x$, $y$, not both 0, for which 

    $|\xi'| \le \rho\lambda, \quad |\eta'| \le \sigma\mu.$

These inequalities are equivalent to (24.3.2), so that the theorem is 
proved in case (2). 

(3) Finally, we suppose $\alpha$, $\beta$, $\gamma$, and $\delta$ unrestricted. If we put 
$\alpha = \alpha'\sqrt{\Delta}, \ldots$, $\xi = \xi'\sqrt{\Delta}, \ldots$, then $\Delta' = \alpha'\delta' - \beta'\gamma' = 1$. If the theorem has 
been proved when $\Delta = 1$, and $\lambda'\mu' \ge 1$, then there are integral $x$, $y$, 
not both 0, for which 

    $|\xi'| \le \lambda', \quad |\eta'| \le \mu';$

and these inequalities are equivalent to (24.3.2), with $\lambda = \lambda'\sqrt{\Delta}$, 
$\mu = \mu'\sqrt{\Delta}$, $\lambda\mu \ge \Delta$. We may therefore suppose without loss of generality 
that $\Delta = 1$.‡ 

We can choose a sequence of rational sets $\alpha_n$, $\beta_n$, $\gamma_n$, $\delta_n$ such that 

    $\alpha_n\delta_n - \beta_n\gamma_n = 1$

and $\alpha_n \to \alpha$, $\beta_n \to \beta, \ldots$, when $n \to \infty$. It follows from (2) that there are 
integers $x_n$ and $y_n$, not both 0, for which 

(24.3.3)    $|\alpha_n x_n + \beta_n y_n| \le \lambda, \quad |\gamma_n x_n + \delta_n y_n| \le \mu.$

Also 

    $|x_n| = |\delta_n(\alpha_n x_n + \beta_n y_n) - \beta_n(\gamma_n x_n + \delta_n y_n)| \le \lambda|\delta_n| + \mu|\beta_n|,$

so that $x_n$ is bounded; and similarly $y_n$ is bounded. It follows, since 
$x_n$ and $y_n$ are integral, that some pair of integers $x$, $y$ must occur 
infinitely often among the pairs $x_n$, $y_n$. Taking $x_n = x$, $y_n = y$ in 
(24.3.3), and making $n \to \infty$ through the appropriate values, we obtain 
(24.3.2). 

† The $\xi$ here is naturally not the $\xi$ of this section. 
‡ A similar appeal to homogeneity would enable us to reduce the proof of any of 
the theorems of this chapter to its proof in the case in which $\Delta$ has any assigned value. 


It is important to observe that this method of proof, by reduction to the case 
of rational or integral coefficients, cannot be used for such a theorem as Theorem 
450. This (when $n = 2$) asserts that $|\xi\eta| \le \tfrac12|\Delta|$ for appropriate $x$, $y$. If we try 
to use the argument of (3) above, it fails because $x_n$ and $y_n$ are not necessarily 
bounded. The failure is natural, since the theorem is trivial when the coefficients 
are rational: we can obviously choose $x$ and $y$ so that $\xi = 0$, $|\xi\eta| = 0 < \tfrac12|\Delta|$. 


24.4. Best possible inequalities. It is easy to see that Theorem 
448 is the best possible theorem of its kind, in the sense that it becomes 
false if (24.2.1) is replaced by 

(24.4.1)    $\lambda_1\lambda_2\ldots\lambda_n \ge k|\Delta|$

with any $k < 1$. Thus if $\xi_r = x_r$ for each $r$, so that $\Delta = 1$, and $\lambda_r = k^{1/n}$, 
then (24.4.1) is satisfied; but $|\xi_r| \le \lambda_r < 1$ implies $x_r = 0$, and there is 
no solution of (24.2.2) except $x_1 = x_2 = \ldots = 0$. 

It is natural to ask whether Theorems 449-51 are similarly ‘best 
possible’. Except in one special case, the answer is negative; the 
numerical constants on the right of (24.2.4), (24.2.5), and (24.2.6) can 
be replaced by smaller numbers. 

The special case referred to is the case $n = 2$ of Theorem 449. This 
asserts that we can make 

(24.4.2)    $|\xi| + |\eta| \le \sqrt{2|\Delta|},$

and it is easy to see that this is the best possible result. If $\xi = x+y$, 
$\eta = x-y$, then $\Delta = -2$, and (24.4.2) is $|\xi| + |\eta| \le 2$. But 

    $|\xi| + |\eta| = \max(|\xi+\eta|, |\xi-\eta|) = \max(|2x|, |2y|),$

and this cannot be less than 2 unless $x = y = 0$.† 
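
The identity and the extremal value just stated can be confirmed by enumeration over a finite box of integer points. The sketch below is a check under an assumed box size of 10, not a proof; the name `l1_value` is introduced for illustration.

```python
def l1_value(x, y):
    """|xi| + |eta| for the forms xi = x + y, eta = x - y."""
    return abs(x + y) + abs(x - y)


box = [(x, y) for x in range(-10, 11) for y in range(-10, 11)]
identity_holds = all(l1_value(x, y) == max(abs(2 * x), abs(2 * y)) for x, y in box)
smallest = min(l1_value(x, y) for x, y in box if (x, y) != (0, 0))
```

Here `smallest` equals $\sqrt{2|\Delta|} = 2$, so the bound of (24.4.2) is attained and cannot be lowered for these forms.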
Theorem 450 is not a best possible theorem even when $n = 2$. It 
then asserts that 

(24.4.3)    $|\xi\eta| \le \tfrac12|\Delta|,$

and we shall show in § 24.6 that the $\tfrac12$ here may be replaced by the 
smaller constant $5^{-1/2}$. We shall also make a corresponding improvement 
in Theorem 451. This asserts (when $n = 2$) that 

    $\xi^2 + \eta^2 \le 4\pi^{-1}|\Delta|,$

and we shall show that $4\pi^{-1} = 1.27\ldots$ may be replaced by $(\tfrac43)^{1/2} = 1.15\ldots$ 

† Actually the case $n = 2$ of Theorem 449 is equivalent to the corresponding case 
of Theorem 448. 

We shall also show that $5^{-1/2}$ and $(\tfrac43)^{1/2}$ are the best possible constants. 
When $n > 2$, the determination of the best possible constants is difficult. 


24.5. The best possible inequality for $\xi^2 + \eta^2$. If 

    $Q(x, y) = ax^2 + 2bxy + cy^2$

is a quadratic form in $x$ and $y$ (with real, but not necessarily integral, 
coefficients); 

    $x = px' + qy', \quad y = rx' + sy' \quad (ps - qr = \pm 1)$

is a unimodular substitution in the sense of § 3.6; and 

    $Q(x, y) = a'x'^2 + 2b'x'y' + c'y'^2 = Q'(x', y'),$

then we say that $Q$ is equivalent to $Q'$, and write $Q \sim Q'$. It is easily 
verified that $a'c' - b'^2 = ac - b^2$, so that equivalent forms have the same 
determinant. It is plain that the assertions that $|Q| \le k$ for 
appropriate integral $x$, $y$, and that $|Q'| \le k$ for appropriate integral $x'$, $y'$, 
are equivalent to one another. 

Now let $x_0$, $y_0$ be coprime integers such that $M = Q(x_0, y_0) \ne 0$. 
We can choose $x_1$, $y_1$ so that $x_0y_1 - x_1y_0 = 1$. The transformation 

(24.5.1)    $x = x_0x' + x_1y', \quad y = y_0x' + y_1y'$

is unimodular and transforms $Q(x, y)$ into $Q'(x', y')$ with 

    $a' = ax_0^2 + 2bx_0y_0 + cy_0^2 = Q(x_0, y_0) = M.$

If we make the further unimodular transformation 

(24.5.2)    $x' = x'' + ny'', \quad y' = y'',$

where $n$ is an integer, $a' = M$ is unchanged and $b'$ becomes 

    $b'' = b' + na' = b' + nM.$

Since $M \ne 0$, we can choose $n$ so that $-|M| < 2b'' \le |M|$. Thus we 
transform $Q(x, y)$ by unimodular substitutions into 

    $Q''(x'', y'') = Mx''^2 + 2b''x''y'' + c''y''^2$

with $-|M| < 2b'' \le |M|$.† 
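
The two substitutions (24.5.1) and (24.5.2) can be carried out mechanically. The sketch below is one possible rendering, given as an illustration rather than as the author’s procedure: it represents $ax^2 + 2bxy + cy^2$ by the triple $(a, b, c)$, uses exact `Fraction` arithmetic, and obtains $x_1$, $y_1$ from the extended Euclidean algorithm. The names `ext_gcd`, `value` and `reduce_at` and the sample forms are assumptions made for the example.

```python
from fractions import Fraction
from math import ceil, floor


def ext_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b) >= 0."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)


def value(form, x, y):
    """Q(x, y) = a x^2 + 2b xy + c y^2 for form = (a, b, c)."""
    a, b, c = form
    return a * x * x + 2 * b * x * y + c * y * y


def reduce_at(form, x0, y0):
    """Return (M, b'', c'') equivalent to form, with M = Q(x0, y0) != 0
    and -|M| < 2b'' <= |M|; x0, y0 must be coprime integers."""
    a, b, c = (Fraction(t) for t in form)
    g, u, v = ext_gcd(x0, y0)
    if g != 1:
        raise ValueError("x0 and y0 must be coprime")
    x1, y1 = -v, u                       # then x0*y1 - x1*y0 == 1
    M = value((a, b, c), x0, y0)
    if M == 0:
        raise ValueError("Q(x0, y0) must not vanish")
    # Coefficients after substitution (24.5.1).
    b1 = a * x0 * x1 + b * (x0 * y1 + x1 * y0) + c * y0 * y1
    c1 = value((a, b, c), x1, y1)
    # Shift (24.5.2): choose n with -|M| < 2(b1 + n*M) <= |M|.
    t = b1 / M
    half = Fraction(1, 2)
    n = floor(half - t) if M > 0 else ceil(-half - t)
    return (M, b1 + n * M, M * n * n + 2 * b1 * n + c1)


reduced_a = reduce_at((2, 3, 7), 1, 0)
reduced_b = reduce_at((2, 3, 7), 1, -1)
reduced_c = reduce_at((1, 0, -2), 1, 1)
```

In each case the quantity $ac - b^2$ is unchanged, in agreement with the remark that equivalent forms have the same determinant.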

We can now improve the results of Theorems 450 and 451, for $n = 2$. 
We take the latter theorem first. 

THEOREM 453. There are integers $x$, $y$, not both 0, for which 

(24.5.3)    $\xi^2 + \eta^2 \le (\tfrac43)^{1/2}|\Delta|;$

and this is true with inequality unless 

(24.5.4)    $\xi^2 + \eta^2 \sim (\tfrac43)^{1/2}|\Delta|\,(x^2 + xy + y^2).$

† A reader familiar with the elements of the theory of quadratic forms will recognize 
Gauss’s method for transforming $Q$ into a ‘reduced’ form. 

We have 

(24.5.5)    $\xi^2 + \eta^2 = ax^2 + 2bxy + cy^2 = Q(x, y),$

where 

(24.5.6)    $a = \alpha^2 + \gamma^2, \quad b = \alpha\beta + \gamma\delta, \quad c = \beta^2 + \delta^2,$
            $ac - b^2 = (\alpha\delta - \beta\gamma)^2 = \Delta^2 > 0.$

Then $Q > 0$ except when $x = y = 0$, and there are at most a finite 
number of integral pairs $x$, $y$ for which $Q$ is less than any given $k$. It 
follows that, among such integral pairs, not both 0, there is one, say 
$(x_0, y_0)$, for which $Q$ assumes a positive minimum value $m$. Clearly $x_0$ 
and $y_0$ are coprime and so, by what we have just said, $Q$ is equivalent 
to a form $Q''$, with $a'' = m$ and $-m < 2b'' \le m$. Thus (dropping the 
dashes) we may suppose that the form is 

    $mx^2 + 2bxy + cy^2,$

where $-m < 2b \le m$. Then $c \ge m$, since otherwise $x = 0$, $y = 1$ 
would give a value less than $m$; and 

(24.5.7)    $\Delta^2 = mc - b^2 \ge m^2 - \tfrac14 m^2 = \tfrac34 m^2,$

so that $m \le (\tfrac43)^{1/2}|\Delta|$. 
This proves (24.5.3). There can be equality throughout (24.5.7) only 
if $c = m$ and $b = \tfrac12 m$, in which case $Q \sim m(x^2 + xy + y^2)$. For this form 
the minimum is plainly $(\tfrac43)^{1/2}|\Delta|$. 
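
The bound of Theorem 453 and the extremal form can be compared numerically. The following sketch is a check by enumeration over a finite box, under the assumption that the minimum of each sample form is attained inside it; the names `min_value` and `theorem_453_bound` and the sample forms are introduced for illustration.

```python
import math


def min_value(a, b, c, bound=10):
    """Least value of a x^2 + 2b xy + c y^2 over integer (x, y) != (0, 0)
    with |x|, |y| <= bound."""
    return min(a * x * x + 2 * b * x * y + c * y * y
               for x in range(-bound, bound + 1)
               for y in range(-bound, bound + 1)
               if (x, y) != (0, 0))


def theorem_453_bound(a, b, c):
    """(4/3)^(1/2) * |Delta|, where Delta^2 = ac - b^2."""
    return math.sqrt(4 / 3) * math.sqrt(a * c - b * b)


extremal = (1, 0.5, 1)      # x^2 + xy + y^2
other = (2, 1, 3)           # 2x^2 + 2xy + 3y^2
```

For `extremal` the minimum and the bound agree up to rounding, while for `other` the minimum 2 lies strictly below the bound, as the theorem predicts for forms not equivalent to a multiple of $x^2 + xy + y^2$.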


24.6. The best possible inequality for (éy|. Passing to the pro- 
duct |&n |, we prove 


THEOREM 454. There are integers x, y not both O for which 


(24.6.1) lén] < 571A]; 
and this is true with inequality unless 
(24.6.2) én ~ 5A |(2°+ry—y?). 


The proof is a little less straightforward than that of Theorem 453 
because we are concerned with an ‘indefinite form’. We write 


(24.6.3) En = ax?+ 2bry+cy? = Q(g, y), 
where 
(24.6.4) a= ay, 2b = a8+By, c= fô, 


4(b?—ac) = A? > 0. 


We write m for the lower bound of |Q(x, y)|, for x and y not both zero;
we may plainly suppose that m > 0 since there is nothing to prove if
m = 0. There may now be no pair x, y such that |Q(x, y)| = m, but
there must be pairs for which |Q(x, y)| is as near to m as we please.

Hence we can find a coprime pair x₀ and y₀ so that m ≤ |M| < 2m,
where M = Q(x₀, y₀). Without loss of generality we may take M > 0.
If we transform as in § 24.5, and drop the dashes, our new quadratic
form is
            Q(x, y) = Mx² + 2bxy + cy²,
where

(24.6.5)    m ≤ M < 2m,    −M < 2b ≤ M
and

(24.6.6)    4(b² − Mc) = Δ² > 0.


By the definition of m, |Q(x, y)| ≥ m for all integral pairs x, y other
than 0, 0. Hence if, for a particular pair, Q(x, y) < m, it follows that
Q(x, y) ≤ −m. Now, by (24.6.5) and (24.6.6),

            Q(0, 1) = c < b²/M ≤ ¼M < m.

Hence c ≤ −m and we write C = −c ≥ m > 0. Again
            Q(1, −b/|b|) = M − |2b| − C ≤ M − C ≤ M − m < m

and so M − |2b| − C ≤ −m, that is

(24.6.7)    |2b| ≥ M + m − C.

If M + m − C < 0, we have C > M + m ≥ 2m and
            Δ² = 4(b² + MC) ≥ 4MC > 8m² > 5m².
If M + m − C ≥ 0, we have from (24.6.7)
            Δ² = 4b² + 4MC ≥ (M + m − C)² + 4MC
               = (M − m + C)² + 4Mm ≥ 5m².
Equality can occur only if M − m + C = m and M = m, so that
M = C = m and |2b| = m. This corresponds to one or other of the
two (equivalent) forms m(x² + xy − y²) and m(x² − xy − y²). For these,
|Q(1, 0)| = m = 5^(−1/2) |Δ|. For all other forms, 5m² < Δ² and so we may
choose x₀, y₀ so that 5m² ≤ 5M² < Δ².
This is Theorem 454. 
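
The extremal role of the form x² + xy − y² can be illustrated by direct search. This is a sketch under our own assumptions (the function name and search box are hypothetical). For a form with integral coefficients whose discriminant is not a square, |Q| takes positive integral values only, so finding the value 1 settles the minimum; for general real coefficients a finite search gives only an upper bound for the lower bound m.

```python
import math

def min_abs_indefinite(a, two_b, c, box=60):
    """Smallest |a x^2 + two_b x y + c y^2| over integer pairs
    (x, y) != (0, 0) with |x|, |y| <= box."""
    best = None
    for x in range(-box, box + 1):
        for y in range(-box, box + 1):
            if x == 0 and y == 0:
                continue
            q = abs(a * x * x + two_b * x * y + c * y * y)
            if best is None or q < best:
                best = q
    return best

def bound_454(a, two_b, c):
    """5^(-1/2) |Delta| with Delta^2 = (2b)^2 - 4ac, as in (24.6.4)."""
    return math.sqrt((two_b * two_b - 4 * a * c) / 5)

# x^2 + xy - y^2: Delta^2 = 5, so the bound is 1 and is attained.
print(min_abs_indefinite(1, 1, -1), bound_454(1, 1, -1))
# x^2 + 2xy - y^2: Delta^2 = 8, and the minimum 1 lies strictly below the bound.
print(min_abs_indefinite(1, 2, -1), bound_454(1, 2, -1))
```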
24.7. A theorem concerning non-homogeneous forms. We
prove next an important theorem of Minkowski concerning
non-homogeneous forms

(24.7.1)    ξ − ρ = αx + βy − ρ,    η − σ = γx + δy − σ.

THEOREM 455. If ξ and η are homogeneous linear forms in x, y, with
determinant Δ ≠ 0, and ρ and σ are real, then there are integral x, y for
which

(24.7.2)    |(ξ − ρ)(η − σ)| ≤ ¼|Δ|;

and this is true with inequality unless

(24.7.3)    ξ = θu,  η = φv,  θφ = Δ,  ρ = θ(f + ½),  σ = φ(g + ½),

where u and v are forms with integral coefficients (and determinant 1),
and f and g are integers.


It will be observed that this theorem differs from all which precede
in that we do not exclude the values x = y = 0. It would be false if
we did not allow this possibility, for example if ξ and η are the special
forms of Theorem 454 and ρ = σ = 0.

It will be convenient to restate the theorem in a different form. The
points in the plane ξ, η corresponding to integral x, y form a lattice
Λ of determinant Δ. Two points P, Q are equivalent with respect to Λ
if the vector PQ is equal to the vector from the origin to a point of Λ;†
and (ξ − ρ, η − σ), with integral x, y, is equivalent to (−ρ, −σ). Hence
the theorem may be restated as

THEOREM 456. If Λ is a lattice of determinant Δ in the plane of (ξ, η),
and Q is any given point of the plane, then there is a point equivalent to
Q for which
(24.7.4)    |ξη| ≤ ¼|Δ|,
with inequality except in the special case (24.7.3).
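
Theorem 455 lends itself to a simple numerical experiment. The sketch below is illustrative only: the function name, the search box, and the sample coefficients are our own assumptions. It searches a finite box of integral x, y for the smallest value of |(ξ − ρ)(η − σ)|.

```python
import math

def best_inhomogeneous(alpha, beta, gamma, delta, rho, sigma, box=30):
    """Return (value, x, y) minimising |(alpha x + beta y - rho)
    (gamma x + delta y - sigma)| over integers |x|, |y| <= box.
    The pair x = y = 0 is allowed, as the theorem requires."""
    best = None
    for x in range(-box, box + 1):
        for y in range(-box, box + 1):
            value = abs((alpha * x + beta * y - rho)
                        * (gamma * x + delta * y - sigma))
            if best is None or value < best[0]:
                best = (value, x, y)
    return best

# Exceptional case (24.7.3) with u = x, v = y, f = g = 0: the minimum is exactly 1/4.
print(best_inhomogeneous(1, 0, 0, 1, 0.5, 0.5))

# A generic case: xi = x + sqrt(2) y, eta = x - sqrt(2) y, |Delta| = 2 sqrt(2).
r2 = math.sqrt(2)
value, x, y = best_inhomogeneous(1, r2, 1, -r2, 0.3, 0.7)
print(value, "<=", 2 * r2 / 4)
```

In the exceptional case every integral pair gives at least ¼, so the sign of equality cannot be removed there.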


In what follows we shall be concerned with three sets of variables,
(x, y), (ξ, η), and (ξ′, η′). We call the planes of the last two sets of
variables π and π′.

We may suppose Δ = 1.‡ By Theorem 450 (and a fortiori by Theorem
454), there is a point P₀ of Λ, other than the origin, and corresponding
to x₀, y₀, for which

(24.7.5)    |ξ₀η₀| ≤ ½.

We may suppose x₀ and y₀ coprime (so that P₀ is 'visible' in the sense
of § 3.6). Since ξ₀ and η₀ satisfy (24.7.5), and are not both 0, there is
a real positive λ for which

(24.7.6)    (λξ₀)² + (λ⁻¹η₀)² = 1.

† See p. 35. It is the same thing to say that the corresponding points in the (x, y)
plane are equivalent with respect to the fundamental lattice.
‡ See the footnote to p. 396.

We put
(24.7.7)    ξ′ = λξ,    η′ = λ⁻¹η.
Then the lattice Λ in π corresponds to a lattice Λ′ in π′, also of
determinant 1. If O′ and P₀′ correspond to O and P₀, then P₀′, like P₀, is
visible; and O′P₀′ = 1, by (24.7.6). Thus the points of Λ′ on O′P₀′ are
spaced out at unit distances, and, since the area of the basic
parallelogram of Λ′ is 1, the other points of Λ′ lie on lines parallel to O′P₀′
which are at unit distances from one another.
We denote by S′ the square whose centre is O′ and one of whose
sides bisects O′P₀′ perpendicularly.† Each side of S′ is 1; S′ lies in
the circle
            ξ′² + η′² = 2(½)² = ½,
and
(24.7.8)    |ξ′η′| ≤ ½(ξ′² + η′²) ≤ ¼

at all points of S′.

If A′ and B′ are two points inside S′, then each component of the
vector A′B′ (measured parallel to the sides of the square) is less than
1, so that A′ and B′ cannot be equivalent with respect to Λ′. It follows
from Theorem 42 that there is a point of S′ equivalent to Q′ (the point
of π′ corresponding to Q). The corresponding point of π is equivalent
to Q, and satisfies
(24.7.9)    |ξη| = |ξ′η′| ≤ ¼.
This proves the main clause of Theorem 456 (or 455).

If there is equality in (24.7.9), there must be equality in (24.7.8), so
that |ξ′| = |η′| = ½. This is only possible if S′ has its sides parallel
to the coordinate axes and the point of S′ in question is at a corner.
In this case P₀′ must be one of the four points (±1, 0), (0, ±1): let us
suppose, for example, that it is (1, 0).

The lattice Λ′ can be based on O′P₀′ and O′P₁′, where P₁′ is on η′ = 1.
We may suppose, selecting P₁′ appropriately, that it is (c, 1), where
0 ≤ c < 1. If the point of S′ equivalent to Q′ is, say, (½, ½), then
(½ − c, ½ − 1), i.e. (½ − c, −½), is another point equivalent to Q′; and this
can only be at a corner of S′, as it must be, if c = 0. Hence P₁′ is
(0, 1), Λ′ is the fundamental lattice in π′, and Q′, being equivalent to
(½, ½), has coordinates

            ξ′ = f + ½,    η′ = g + ½,
where f and g are integers. We are thus led to the exceptional case
(24.7.3), and it is plain that in this case the sign of equality is necessary.
† The reader should draw a figure.

24.8. Arithmetical proof of Theorem 455. We also give an
arithmetical proof of the main clause of Theorem 455. We transform it as
in Theorem 456, and we have to show that, given μ and ν, we can
satisfy (24.7.4) with an x and a y congruent to μ and ν to modulus 1.

We again suppose Δ = 1. As in § 24.7, there are integers x₀, y₀, which
we may suppose coprime, for which

            |(αx₀ + βy₀)(γx₀ + δy₀)| ≤ ½.

We choose x₁ and y₁ so that x₀y₁ − x₁y₀ = 1. The transformation
            x = x₀x′ + x₁y′,    y = y₀x′ + y₁y′

changes ξ and η into forms ξ′ = α′x′ + β′y′, η′ = γ′x′ + δ′y′ for which
            |α′γ′| = |(αx₀ + βy₀)(γx₀ + δy₀)| ≤ ½.

Hence, reverting to our original notation, we may suppose without loss
of generality that

(24.8.1)    |αγ| ≤ ½.
It follows from (24.8.1) that there is a real λ for which
            λ²α² + λ⁻²γ² = 1;
and
            2|(αx + βy)(γx + δy)| ≤ λ²(αx + βy)² + λ⁻²(γx + δy)²
                                  = x² + 2bxy + cy² = (x + by)² + py²,
for some b, c, p. The determinant of this quadratic form is, on the one
hand, the square of that of λ(αx + βy) and λ⁻¹(γx + δy),† that is to say 1,
and on the other the square of that of x + by and p^(1/2) y, that is to say p;
and therefore p = 1. Thus
            2|(αx + βy)(γx + δy)| ≤ (x + by)² + y².
We can choose y ≡ ν (mod 1) so that |y| ≤ ½, and then x ≡ μ (mod 1)
so that |x + by| ≤ ½; and then
            |ξη| ≤ ½{(½)² + (½)²} = ¼.

We leave it to the reader to discriminate the cases of equality in this
alternative proof.
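
The last step of the arithmetical proof is effectively an algorithm: first fix y in its residue class to modulus 1, then fix x. A minimal sketch follows, assuming the reduction to the form (x + by)² + y² has already been carried out; the function name and the sample values of b, μ, ν are our own.

```python
import math

def choose_xy(b, mu, nu):
    """Pick y congruent to nu (mod 1) with |y| <= 1/2, then x congruent
    to mu (mod 1) with |x + b y| <= 1/2, as in the proof of § 24.8."""
    y = nu - math.floor(nu + 0.5)      # representative of nu in [-1/2, 1/2)
    s = mu + b * y
    x = mu - math.floor(s + 0.5)       # shifts x + b*y into [-1/2, 1/2)
    return x, y

b, mu, nu = 0.37, 2.8, -1.3            # hypothetical sample values
x, y = choose_xy(b, mu, nu)
# (x + by)^2 + y^2 <= 1/2, so the product of the two forms is at most 1/4.
print(x, y, (x + b * y) ** 2 + y ** 2)
```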


24.9. Tchebotaref's theorem. It has been conjectured that
Theorem 455 could be extended to n dimensions, with 2ⁿ in place of 4;
but this has been proved only for n = 3 and n = 4. There is, however,
a theorem of Tchebotaref which goes some way in this direction.

† See (24.5.5) and (24.5.6).

THEOREM 457. If ξ₁, ξ₂,..., ξₙ are homogeneous linear forms in x₁,
x₂,..., xₙ, with real coefficients and determinant Δ; ρ₁, ρ₂,..., ρₙ are real;
and m is the lower bound of

            |(ξ₁ − ρ₁)(ξ₂ − ρ₂)...(ξₙ − ρₙ)|,
then

(24.9.1)    m ≤ 2^(−n/2) |Δ|.


We may suppose Δ = 1 and m > 0. Then, given any positive ε,
there are integers x₁*, x₂*,..., xₙ* for which

(24.9.2)    ∏ |ξᵢ* − ρᵢ| = m/(1 − θ)    (0 ≤ θ < ε).

We put
            ξᵢ′ = (ξᵢ − ξᵢ*)/(ξᵢ* − ρᵢ)    (i = 1, 2,..., n).

Then ξ₁′,..., ξₙ′ are linear forms in x₁ − x₁*,..., xₙ − xₙ*, with a determinant
D whose absolute value is

            |D| = (∏ |ξᵢ* − ρᵢ|)⁻¹ = (1 − θ)/m,

and the points in ξ′-space corresponding to integral x form a lattice
Λ′ whose determinant is of absolute value (1 − θ)/m. Since

            ∏ |ξᵢ − ρᵢ| ≥ m,
every point of Λ′ satisfies

            ∏ |ξᵢ′ + 1| = ∏ |(ξᵢ − ρᵢ)/(ξᵢ* − ρᵢ)| ≥ 1 − θ.

The same inequality is satisfied by the point symmetrical about the
origin, so that ∏ |ξᵢ′ − 1| ≥ 1 − θ and

(24.9.3)    ∏ |(ξᵢ′)² − 1| = |((ξ₁′)² − 1)((ξ₂′)² − 1)...((ξₙ′)² − 1)| ≥ (1 − θ)².


We now prove that when ε and θ are small, there is no point of Λ′,
other than the origin, in the cube C′ defined by

(24.9.4)    |ξᵢ′| < √{1 + (1 − θ)²}.

If there is such a point, it satisfies

(24.9.5)    −1 ≤ (ξᵢ′)² − 1 < (1 − θ)² ≤ 1    (i = 1, 2,..., n).
If

(24.9.6)    (ξᵢ′)² − 1 > −(1 − θ)²

for some i, then |(ξᵢ′)² − 1| < (1 − θ)² for that i, and |(ξᵢ′)² − 1| ≤ 1 for every
i; so that
            ∏ |(ξᵢ′)² − 1| < (1 − θ)²,
in contradiction to (24.9.3). Hence (24.9.6) is impossible, and therefore

            −1 ≤ (ξᵢ′)² − 1 ≤ −(1 − θ)²    (i = 1, 2,..., n);
and hence

(24.9.7)    |ξᵢ′| ≤ √{1 − (1 − θ)²} ≤ √(2θ)    (i = 1, 2,..., n).


Thus every point of Λ′ in C′ is very near to the origin when ε and θ are
small.

But this leads at once to a contradiction. For if (ξ₁′,..., ξₙ′) is a point
of Λ′, then so is (Nξ₁′,..., Nξₙ′) for every integral N. If θ is small, every
coordinate of a lattice point in C′ satisfies (24.9.7); and if at least one of
them is not 0, then plainly we can choose N so that (Nξ₁′,..., Nξₙ′),
while still in C′, is at a distance at least ½ from the origin, and
therefore cannot satisfy (24.9.7). The contradiction shows that, as we stated,
there is no point of Λ′, except the origin, in C′.

It is now easy to complete the proof of Theorem 457. Since there
is no point of Λ′, except the origin, in C′, it follows from Theorem 447
that the volume of C′ does not exceed

            2ⁿ|D| = 2ⁿ(1 − θ)/m;
and therefore that
            2ⁿ m {1 + (1 − θ)²}^(n/2) ≤ 2ⁿ(1 − θ).

Dividing by 2ⁿ, and making θ → 0, we obtain

            m ≤ 2^(−n/2),
the result of the theorem.


24.10. A converse of Minkowski’s Theorem 446. There is a 
partial converse of Theorem 446, which we shall prove for the case 
n = 2. The result is not confined to convex regions and we therefore 
first redefine the area of a bounded region P, since the definition of 
p. 32 may no longer be applicable. 

For every ρ > 0, we denote by Λ(ρ) the lattice of points (ρx, ρy),
where x, y take all integral values, and write g(ρ) for the number of
points of Λ(ρ) (apart from the origin O) which belong to the bounded
region P. We call
(24.10.1)    V = lim ρ²g(ρ)    (ρ → 0)

the area of P, if the limit exists. This definition embodies the only
property of area which we require in what follows. It is clearly
equivalent to any natural definition of area for elementary regions such
as polygons, ellipses, etc.
We prove first

THEOREM 458. If P is a bounded plane region with an area V which
is less than 1, there is a lattice of determinant 1 which has no point (except
perhaps O) belonging to P.


Since P is bounded, there is a number N such that
(24.10.2)    −N ≤ ξ ≤ N,    −N ≤ η ≤ N
for every point (ξ, η) of P. Let p be any prime such that
(24.10.3)    p > N².

Let u be any integer and Λᵤ the lattice of points (ξ, η), where

            ξ = X/√p,    η = (uX + pY)/√p

and X, Y take all integral values. The determinant of Λᵤ is 1. If
Theorem 458 is false, there is a point Tᵤ belonging to both Λᵤ and P
and not coinciding with O. Let the coordinates of Tᵤ be

            ξᵤ = Xᵤ/√p,    ηᵤ = (uXᵤ + pYᵤ)/√p.
If Xᵤ = 0, we have
            √p |Yᵤ| = |ηᵤ| ≤ N < √p

by (24.10.2) and (24.10.3). It follows that Yᵤ = 0 and Tᵤ is O, contrary
to our hypothesis. Hence Xᵤ ≠ 0 and

            0 < |Xᵤ| = √p |ξᵤ| ≤ N√p < p.
Thus

(24.10.4)    Xᵤ ≢ 0 (mod p).
If Tᵤ and Tᵥ coincide, we have
            Xᵤ = Xᵥ,    uXᵤ + pYᵤ = vXᵥ + pYᵥ
and so
            Xᵤ(u − v) ≡ 0,    u ≡ v (mod p)

by (24.10.4). Hence the p points

(24.10.5)    T₀, T₁, T₂,..., Tₚ₋₁
are all different. Since they all belong to P and to Λ(p^(−1/2)), it follows
that
            g(p^(−1/2)) ≥ p.

But this is false for large enough p, since

            p⁻¹ g(p^(−1/2)) → V < 1
by (24.10.1). Hence Theorem 458 is true.
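
The proof suggests a search procedure: for a given prime p, try u = 0, 1,..., p − 1 until a lattice Λᵤ is found with no point other than O in P. The sketch below is only an illustration of this idea; the function names, the disc of radius ½ (area about 0.785), and the prime p = 7 are our own choices, and for a general region a larger prime may be needed before any u succeeds.

```python
import math

def points_in_region(u, p, inside, N):
    """Pairs (X, Y) != (0, 0) whose lattice point
    (X/sqrt(p), (u X + p Y)/sqrt(p)) lies in the region `inside`,
    which is assumed to lie in the square |xi| <= N, |eta| <= N."""
    root = math.sqrt(p)
    found = []
    x_max = math.floor(N * root)
    for X in range(-x_max, x_max + 1):
        # Only Y with |u X + p Y| <= N sqrt(p) can give a point of the square.
        y_lo = math.ceil((-N * root - u * X) / p)
        y_hi = math.floor((N * root - u * X) / p)
        for Y in range(y_lo, y_hi + 1):
            if X == 0 and Y == 0:
                continue
            if inside(X / root, (u * X + p * Y) / root):
                found.append((X, Y))
    return found

def find_admissible_u(p, inside, N):
    """First u in 0..p-1 for which the lattice has no point but O in
    the region, or None if there is no such u for this p."""
    for u in range(p):
        if not points_in_region(u, p, inside, N):
            return u
    return None

def small_disc(xi, eta):
    return xi * xi + eta * eta <= 0.25

print(find_admissible_u(7, small_disc, 0.5))
```

Here u = 0 fails, since (1/√7, 0) lies in the disc, while u = 1 already gives a lattice of determinant 1 with no point but O in the disc.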
For our next result we require the idea of visible points of a lattice
introduced in Ch. III. A point T of Λ(ρ) is visible (i.e. visible from the
origin) if T is not O and if there is no point of Λ(ρ) on OT between O
and T. We write f(ρ) for the number of visible points of Λ(ρ) belonging
to P and prove the following lemma.

THEOREM 459:    ρ²f(ρ) → V/ζ(2)    as ρ → 0.

The number of points of Λ(ρ), other than O, whose coordinates satisfy
(24.10.2) is (2[N/ρ] + 1)² − 1.
Hence
(24.10.6)    f(ρ) = g(ρ) = 0    (ρ > N)
and
(24.10.7)    f(ρ) ≤ g(ρ) < 9N²/ρ²
for all ρ.


Clearly (ρx, ρy) is a visible point of Λ(ρ) if, and only if, x, y are
coprime. More generally, if m is the highest common factor of x and y,
the point (ρx, ρy) is a visible point of Λ(mρ) but not of Λ(kρ) for any
integral k ≠ m. Hence

            g(ρ) = Σ_{m=1}^{∞} f(mρ).

By Theorem 270, it follows that

            f(ρ) = Σ_{m=1}^{∞} μ(m) g(mρ).

The convergence condition of that theorem is satisfied trivially since,
by (24.10.6), f(mρ) = g(mρ) = 0 for mρ > N. Again, by Theorem 287,

            1/ζ(2) = Σ_{m=1}^{∞} μ(m)/m²
and so
(24.10.8)    ρ²f(ρ) − V/ζ(2) = Σ_{m=1}^{∞} (μ(m)/m²) {m²ρ²g(mρ) − V}.
m=1 


Now let ε > 0. By (24.10.1), there is a number ρ₁ = ρ₁(ε) such that
            |m²ρ²g(mρ) − V| < ε
whenever mρ < ρ₁. Again, by (24.10.7),

            |m²ρ²g(mρ) − V| < 9N² + V

for all m. If we write M = [ρ₁/ρ], we have, by (24.10.8),

            |ρ²f(ρ) − V/ζ(2)| < ε Σ_{m=1}^{M} 1/m² + (9N² + V) Σ_{m=M+1}^{∞} 1/m²
                              < επ²/6 + (9N² + V)/M < 3ε,

if ρ is small enough to make

            M = [ρ₁/ρ] > (9N² + V)/ε.
Since ε is arbitrary, Theorem 459 follows at once.
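
Both the identity connecting g and f and the limit in Theorem 459 can be observed on the unit disc. The sketch below is an illustration under our own assumptions: we take ρ = 1/R with R a positive integer, so that membership of the disc can be tested in exact integer arithmetic, and the function name is hypothetical.

```python
import math

def count_points(R, m=1, visible_only=False):
    """Points (x, y) != (0, 0) of the lattice with spacing m/R lying in
    the unit disc, i.e. with m^2 (x^2 + y^2) <= R^2. With
    visible_only=True only coprime pairs (visible points) are counted."""
    k = R // m
    total = 0
    for x in range(-k, k + 1):
        for y in range(-k, k + 1):
            if x == 0 and y == 0:
                continue
            if m * m * (x * x + y * y) > R * R:
                continue
            if visible_only and math.gcd(x, y) != 1:
                continue
            total += 1
    return total

R = 30
# g(rho) equals the sum of f(m rho) over m; the terms vanish once m rho > 1.
print(count_points(R), sum(count_points(R, m, True) for m in range(1, R + 1)))

R = 100
# rho^2 f(rho) compared with V / zeta(2) = pi / (pi^2 / 6) = 6 / pi.
print(count_points(R, 1, True) / R ** 2, 6 / math.pi)
```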

We can now show that the condition V < 1 of Theorem 458 can be
relaxed if we confine our result to regions of a certain special form.
We say that the bounded region P is a star region provided that (i) O
belongs to P, (ii) P has an area V defined by (24.10.1), and (iii) if T is
any point of P, then so is every point of OT between O and T. Every
convex region containing O is a star region; but there are star regions
which are not convex. We can now prove

THEOREM 460. If P is a star region, symmetrical about O and of area
V < 2ζ(2) = ⅓π², there is a lattice of determinant 1 which has no point
(except possibly O) in P.


We use the same notation and argument as in the proof of Theorem
458. If Theorem 460 is false, there is a Tᵤ, different from O, belonging
to Λᵤ and to P.

If Tᵤ is not a visible point of Λ(p^(−1/2)), we have m > 1, where m is the
highest common factor of Xᵤ and uXᵤ + pYᵤ. By (24.10.4), p ∤ Xᵤ and
so p ∤ m. Hence m | Yᵤ. If we write Xᵤ = mXᵤ′, Yᵤ = mYᵤ′, the
numbers Xᵤ′ and uXᵤ′ + pYᵤ′ are coprime. Thus the point Tᵤ′, whose
coordinates are
            Xᵤ′/√p,    (uXᵤ′ + pYᵤ′)/√p,

belongs to Λᵤ and is a visible point of Λ(p^(−1/2)). But Tᵤ′ lies on OTᵤ and
so belongs to the star region P. Hence, if Tᵤ is not visible, we may
replace it by a visible point.

Now P contains the p points
(24.10.9)    T₀, T₁,..., Tₚ₋₁,
all visible points of Λ(p^(−1/2)), all different (as before) and none coinciding
with O. Since P is symmetrical about O, P also contains the p points
(24.10.10)    T̄₀, T̄₁,..., T̄ₚ₋₁,
where T̄ᵤ is the point (−ξᵤ, −ηᵤ). All these p points are visible points
of Λ(p^(−1/2)), all are different and none is O. Now Tᵤ and T̄ᵤ cannot
coincide (for then each would be O). Again, if u ≠ v and Tᵤ and T̄ᵥ coincide,
we have
            Xᵤ = −Xᵥ,    uXᵤ + pYᵤ = −vXᵥ − pYᵥ,
            (u − v)Xᵤ ≡ 0,    Xᵤ ≡ 0 or u ≡ v (mod p),

both impossible. Hence the 2p points listed in (24.10.9) and (24.10.10)
are all different, all visible points of Λ(p^(−1/2)) and all belong to P so that

(24.10.11)    f(p^(−1/2)) ≥ 2p.
But, by Theorem 459, as p → ∞,
            p⁻¹ f(p^(−1/2)) → V/ζ(2) = 6V/π² < 2
by hypothesis, and so (24.10.11) is false for large enough p. Theorem
460 follows.


The above proofs of Theorems 458 and 460 extend at once to n
dimensions. In Theorem 460, ζ(2) is replaced by ζ(n).


NOTES ON CHAPTER XXIV 


§ 24.1. Minkowski's writings on the geometry of numbers are contained in his
books Geometrie der Zahlen and Diophantische Approximationen, already referred
to in the note on § 3.10, and in a number of papers reprinted in his Gesammelte
Abhandlungen (Leipzig, 1911). The fundamental theorem was first stated and
proved in a paper of 1891 (Gesammelte Abhandlungen, i. 255). There is a very
full account of the history and bibliography of the subject, up to 1936, in Koksma,
chs. 2 and 3, and a survey of recent progress by Davenport in Proc. International
Congress Math. (Cambridge, Mass., 1950), 1 (1952), 166-74.

Siegel [Acta Math. 65 (1935), 307-23] has shown that if V is the volume of
a convex and symmetrical region R containing no lattice point but O, then

            2ⁿ = V + V⁻¹ Σ |I|²,
where each I is a multiple integral over R. This formula makes Minkowski's
theorem evident.

Minkowski (Geometrie der Zahlen, 211-19) proved a further theorem which
includes and goes beyond the fundamental theorem. We suppose R convex and
symmetrical, and write λR for R magnified linearly about O by a factor λ. We
define λ₁, λ₂,..., λₙ as follows: λ₁ is the least λ for which λR has a lattice point
P₁ on its boundary; λ₂ the least for which λR has a lattice point P₂, not collinear
with O and P₁, on its boundary; λ₃ the least for which λR has a lattice point P₃,
not coplanar with O, P₁, and P₂, on its boundary; and so on. Then

            0 < λ₁ ≤ λ₂ ≤ ... ≤ λₙ

(λ₂, for example, being equal to λ₁ if λ₁R has a second lattice point, not collinear
with O and P₁, on its boundary); and

            λ₁λ₂...λₙV ≤ 2ⁿ.
The fundamental theorem is equivalent to λ₁ⁿV ≤ 2ⁿ. Davenport [Quarterly
Journal of Math. (Oxford), 10 (1939), 117-21] has given a short proof of the
more general theorem.

§ 24.2. All these applications of the fundamental theorem were made by
Minkowski.

Siegel, Math. Annalen, 87 (1922), 36-8, gave an analytic proof of Theorem 448: 
see also Mordell, ibid. 103 (1930), 38-47. 

Hajós, Math. Zeitschrift, 47 (1941), 427-67, has proved an interesting
conjecture of Minkowski concerning the 'boundary case' of Theorem 448. Suppose
that Δ = 1, so that there are integral x₁, x₂,..., xₙ, not all 0, such that
|ξᵣ| ≤ 1 for r = 1, 2,..., n. Can the xᵣ be chosen so that |ξᵣ| < 1 for every r?
Minkowski's conjecture, now established by Hajós, was that this is true except
when the ξᵣ can be reduced, by a change of order and a unimodular substitution,
to the forms

            ξ₁ = x₁,  ξ₂ = α₂,₁x₁ + x₂,  ...,  ξₙ = αₙ,₁x₁ + αₙ,₂x₂ + ... + xₙ.

The conjecture had been proved before only for n ≤ 7.

The first general results concerning the minima of definite quadratic forms 
were found by Hermite in 1847 (Œuvres, i, 100 et seq.): these are not quite so 
sharp as Minkowski's. 

§ 24.3. The first proof of this character was found by Hurwitz, Göttinger
Nachrichten (1897), 139-45, and is reproduced in Landau, Algebraische Zahlen, 34-40.
The proof was afterwards simplified by Weber and Wellstein, Math. Annalen,
73 (1912), 275-85, Mordell, Journal London Math. Soc. 8 (1933), 179-82, and
Rado, ibid. 9 (1934), 164-5 and 10 (1935), 115. The proof given here is
substantially Rado's (reduced to two dimensions).

§ 24.5. Theorem 453 is in Gauss, D.A., § 171. The corresponding results for
forms in n variables are known only for n ≤ 8: see Koksma, 24, and Mordell,
Journal London Math. Soc. 19 (1944), 3-6.

§ 24.6. Theorem 454 was first proved by Korkine and Zolotareff, Math. Annalen 
6 (1873), 366-89 (369). Our proof is due to Professor Davenport. See Macbeath, 
Journal London Math. Soc. 22 (1947), 261-2, for another simple proof. There is 
a close connexion between Theorems 193 and 454. 

Theorem 454 is the first of a series of theorems, due mainly to Markoff, of
which there is a systematic account in Dickson, Studies, ch. 7. If ξη is not
equivalent either to (24.6.2) or to

(a)         8^(−1/2) |Δ| (x² + 2xy − y²),

then        |ξη| < 8^(−1/2) |Δ|

for appropriate x, y; if it is not equivalent either to (24.6.2), to (a), or to
(b)         (221)^(−1/2) |Δ| (5x² + 11xy − 5y²),

then        |ξη| < 5(221)^(−1/2) |Δ|;

and so on. The numbers on the right of these inequalities are

(c)         m(9m² − 4)^(−1/2),
where m is one of the 'Markoff numbers' 1, 2, 5, 13, 29,...; and the numbers (c)
have the limit ⅓. See Cassels, Annals of Math. 50 (1949), 676-85 for a proof of
these theorems.
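
The constants (c) are easily tabulated. In the sketch below we generate the Markoff numbers from the standard description, not stated in the text above, that they are the integers occurring in positive solutions of x² + y² + z² = 3xyz, all of which arise from (1, 1, 1) by replacing one member of a solution by its conjugate root. The function name and the limit are our own.

```python
import math

def markoff_numbers(limit):
    """Markoff numbers not exceeding `limit`, found by walking the tree
    of solutions of x^2 + y^2 + z^2 = 3xyz from (1, 1, 1)."""
    seen = set()
    numbers = set()
    stack = [(1, 1, 1)]
    while stack:
        triple = stack.pop()
        if triple in seen:
            continue
        seen.add(triple)
        numbers.update(triple)
        x, y, z = triple
        # Replace each member in turn by the other root of the quadratic.
        for nxt in ((3 * y * z - x, y, z),
                    (x, 3 * x * z - y, z),
                    (x, y, 3 * x * y - z)):
            nxt = tuple(sorted(nxt))
            if nxt[2] <= limit and nxt not in seen:
                stack.append(nxt)
    return sorted(numbers)

for m in markoff_numbers(100):
    # m (9 m^2 - 4)^(-1/2) decreases towards the limit 1/3.
    print(m, m / math.sqrt(9 * m * m - 4))
```

The first three values are 5^(−1/2), 8^(−1/2), and 5(221)^(−1/2), in agreement with (24.6.1), (a), and (b).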


There is a similar set of theorems associated with rational approximations to 


an irrational é, of which the simplest is Theorem 193: see §§ 11.8-10, and Koksma, 
31-33. 


Davenport [Proc. London Math. Soc. (2) 44 (1938), 412-31, and Journal
London Math. Soc. 16 (1941), 98-101] has solved the corresponding problem for
n = 3. We can make
            |ξ₁ξ₂ξ₃| < (1/7)|Δ|

unless      ξ₁ξ₂ξ₃ ~ (1/7)|Δ| ∏ (x₁ + θx₂ + θ²x₃),

where the product extends over the roots θ of θ³ + θ² − 2θ − 1 = 0. Mordell, in
Journal London Math. Soc. 17 (1942), 107-15, and a series of subsequent papers
in the Journal and Proceedings, has obtained the best possible inequality for the
minimum of a general binary cubic form with given determinant, and has shown
how Davenport's result can be deduced from it; and this has been the
starting-point for a considerable body of work, by Mordell, Mahler, and Davenport, on
lattice points in non-convex regions.

The corresponding problem for n > 3 has not yet been solved.

Minkowski [Göttinger Nachrichten (1904), 311-35; Gesammelte Abhandlungen, ii.
3-42] found the best possible result for |ξ₁| + |ξ₂| + |ξ₃|, viz.

            |ξ₁| + |ξ₂| + |ξ₃| ≤ ((108/19)|Δ|)^(1/3).
No simple proof of this result is known, nor any corresponding result with n > 3.

§§ 24.7-8. Minkowski proved Theorem 455 in Math. Annalen, 54 (1904), 108-14 
(Gesammelte Abhandlungen, i. 320-56, and Diophantische Approximationen, 42-7). 
The proof in § 24.7 is due to Heilbronn and that in § 24.8 to Landau, Journal für 
Math. 165 (1931), 1-3: the two proofs, though very different in form, are based 
on the same idea. Davenport [Acta Math. 80 (1948), 65-95] solved the corre- 
sponding problem for indefinite ternary quadratic forms. 

§ 24.9. The conjecture mentioned at the beginning of this section is usually
attributed to Minkowski, but Dyson [Annals of Math. 49 (1948), 82-109] remarks
that he can find no reference to it in Minkowski's published work. Remak [Math.
Zeitschrift, 17 (1923), 1-34 and 18 (1923), 173-200] proved the truth of the
conjecture for n = 3 and Dyson [loc. cit.] its truth for n = 4. Davenport [Journal
London Math. Soc. 14 (1939), 47-51] gave a much shorter proof for n = 3.

It is easy to prove the truth of the conjecture when the coefficients of the 
forms are rational. 

Tchebotaref's theorem appeared in Bulletin Univ. Kasan (2) 94 (1934), Heft 7,
3-16; the proof is reproduced in Zentralblatt für Math. 18 (1938), 110-11. Mordell
[Vierteljahrsschrift d. Naturforschenden Ges. in Zürich, 85 (1940), 47-50] has shown
that the result may be sharpened a little. See also Davenport, Journal London
Math. Soc. 21 (1946), 28-34.

§ 24.10. Minkowski [Gesammelte Abhandlungen (Leipzig, 1911), i. 265, 270, 277]
first conjectured the n-dimensional generalizations of Theorems 458 and 460 and
proved the latter for the n-dimensional sphere [loc. cit. ii. 95]. The first proof
of the general theorems was given by Hlawka [Math. Zeitschrift, 49 (1944),
285-312]. Our proof is due to Rogers [Annals of Math. 48 (1947), 994-1002 and
Nature 159 (1947), 104-5]. See also Cassels, Proc. Cambridge Phil. Soc. 49 (1953),
165-6, for a simple proof of Theorem 460 and Rogers, Proc. London Math. Soc. (3)
6 (1956), 305-20, and Schmidt, Monatsh. Math. 60 (1956), 1-10 and 110-13, for
improvements of Hlawka's results.
The references give the section and page where the definition of the
symbol in question is to be found. We include all symbols which occur
frequently in standard senses, but not symbols which, like S(m, n) in
§ 5.6, are used only in particular sections.

Symbols in the list are sometimes also used temporarily for other
purposes, as is γ in § 3.11 and elsewhere.


General analytical symbols 


O, o, ~, ≺, ≍, |f|, A (unspecified constant)    § 1.6            p. 7
min(x, y), max(x, y)                            § 5.1            p. 48
e(τ) = e^(2πiτ)                                 § 5.6            p. 54
[x]                                             § 6.11           p. 74
(x), x̄                                          § 11.3           p. 156
[a₀, a₁,..., aₙ] (continued fraction)           § 10.1           p. 129
pₙ, qₙ (convergents)                            § 10.2           p. 130
aₙ′                                             §§ 10.5, 10.9    pp. 133, 139
qₙ′                                             §§ 10.7, 10.9    pp. 137, 140


Symbols of divisibility, congruence, etc. 


b | a, b ∤ a                                    § 1.1            p. 1
(a, b), (a, b,..., k)                           § 2.9            p. 20
{a, b}                                          § 5.1            p. 48
x ≡ a (mod m), x ≢ a (mod m)                    § 5.2            p. 49
f(x) ≡ g(x) (mod m)                             § 7.2            p. 82
g(x) | f(x) (mod m)                             § 7.3            p. 83
= (modm), ? (modm) § 7.8 p. 89 
k(1)                                            § 12.2           p. 178
k(i)                                            § 12.2           p. 179
k(ρ)                                            § 12.2           p. 179
k(ϑ)                                            § 14.1           p. 204

β | α, β ∤ α, α ≡ β (mod γ) [in k(i) and other fields]
        §§ 12.6 (p. 182), 12.9 (p. 188), 14.4 (p. 208), 15.2 (p. 219)
ε (unity)       §§ 12.4 (p. 181), 12.6 (p. 182), 14.4 (p. 208)
Nα (norm)       §§ 12.6 (p. 182), 12.9 (p. 187), 14.4 (p. 208)
∏_p f(p), ∏_{p|n} f(p)                          § 5.1            p. 48 (f.n.)
a R p, a N p, (a/p)                             § 6.5            pp. 67-8

Special numbers and functions

π(x)                                            § 1.5            p. 6
pₙ                                              § 1.5            p. 6
Fₙ (Fermat number)                              § 2.4            p. 14
Mₙ (Mersenne number)                            § 2.5            p. 16
𝔉ₙ (Farey series)                               § 3.1            p. 23
γ (Euler's constant)                            §§ 4.2, 18.2     pp. 39 (f.n.), 264 (f.n.)
φ(m)                                            § 5.5            p. 52
c_q(n)                                          § 5.6            p. 55
μ(n)                                            § 16.3           p. 234
d(n), σₖ(n), σ(n)                               § 16.7           p. 238
r(n), d₁(n), d₃(n)                              § 16.9           pp. 240-1
χ(n)                                            § 16.9           p. 240
ζ(s)                                            § 17.2           p. 245
Λ(n)                                            § 17.7           p. 253
p(n)                                            § 19.2           p. 273
g(k), G(k)                                      § 20.1           p. 298
v(k)                                            § 21.7           p. 325
P(k, j)                                         § 21.9           pp. 328-9
ϑ(x), ψ(x)                                      § 22.1           p. 340
U(x)                                            § 22.1           p. 340
ω(n), Ω(n)                                      § 22.10          p. 354
Words 


We add references to the definitions of a small number of words and 
phrases which a reader may find difficulty in tracing because they do 
not occur in the headings of sections. 


standard form of n § 1.2 p. 2 
of the same order of magnitude § 1.6 p.7 
asymptotically equivalent, asymptotic to § 1.6 p.8 
almost all (integers) § 1.6 p. 8 
almost all (real numbers) § 9.10 p. 122 
quadratfrei § 2.6 p. 16 
highest common divisor § 2.9 p. 20 


unimodular transformation § 3.6 p. 28 
least common multiple                           § 5.1
coprime                                         § 5.1
multiplicative function                         § 5.5
primitive root of unity                         § 5.6
a belongs to d (mod m)                          § 6.8
primitive root of m                             § 6.8
minimal residue (mod m)                         § 6.11
Euclidean number                                § 11.5
Euclidean construction                          § 11.5
algebraic field                                 § 14.1
simple field                                    § 14.7
Euclidean field                                 § 14.7
linear independence of numbers                  § 23.4
Bachet, 115-17, 202, 315.
Bernoulli, 90, 91, 202, 245.
Bernstein, 168, 177.
Bôcher, 397.
Bohl, 393.
Carmichael, 11.
Champernowne, 128.

Charves, 153.

Chatland, 217.

Chen, 337.

Cherwell (see Lindemann,
Chrystal, 153.

Cipolla, 81.
Copeland, 128.
374.


Coxeter, 22.
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Gegenbauer, 272. 

Gelfond, 47, 176, 177. 

Gérardin, 203, 339. 

Gillies, 22. 

Glaisher, 106, 316, 373.

Gloden, 338. 

Goldbach, 19, 22. 

Goldberg, 81. 

Grace, 301, 315. 

Grandjot, 62. 

Gronwall, 272. 

Grunert, 128. 

Gupta, 289, 295. 

Gwyther, 295. 


Hadamard, 11, 374. 

Hajós, 37, 412. 

Hall, 373. 

Hardy, 106, 159, 168, 259,
272, 289, 296, 316, 335,
336, 338, 373, 374, 393.

Haros, 8386.

Hasse, 22. 

Hausdorff, 128. 

Heaslet, 316. 

Heath, 42, 43, 47, 201. 

Hecke, 22, 93, 159. 

Heilbronn, vii, 212, 213, 
217, 336, 413. 

Hermite, 47, 177, 315, 412. 

Hilbert, 177, 298, 315, 335, 
336. 

Hlawka, 413. 

Hobson, 128, 176. 

Hölder, 243. 
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22. 




Ingham, 11, 22, 232, 259, 
373. 
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